Arc-based capsids and uses thereof

ABSTRACT

Disclosed herein, in certain embodiments, are recombinant Arc and endogenous Gag polypeptides, and methods of using recombinant Arc and endogenous Gag polypeptides.

CROSS REFERENCE

This application is a continuation of U.S. patent application Ser. No. 17/277,119, filed Mar. 17, 2021, which is a national phase entry of International Application No. PCT/US2019/051786, filed Sep. 18, 2019, which claims the benefit of U.S. Provisional Patent Application No. 62/733,015, filed Sep. 18, 2018, each of which is incorporated herein by reference in its entirety.

SEQUENCE LISTING STATEMENT

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jan. 15, 2020, is named 54838_702_304_SL.txt and is 148,382 bytes in size.

SUMMARY OF THE DISCLOSURE

Disclosed herein, in certain embodiments, are recombinant and engineered Arc polypeptides and recombinant and engineered endogenous Gag (endo-Gag) polypeptides. In some embodiments, also included are Arc-based capsids and endo-Gag-based capsids, either loaded or empty, and methods of preparing the capsids. Additionally included are methods of delivery of the Arc-based capsids and endo-Gag-based capsids to a site of interest.

Disclosed herein, in certain embodiments, is a capsid comprising a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide and a therapeutic agent. In some embodiments, the therapeutic agent is a nucleic acid. In some embodiments, the nucleic acid is an RNA. In some embodiments, the recombinant Arc polypeptide is a human Arc polypeptide comprising an amino acid sequence that is SEQ ID NO: 1 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 1. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is a human endogenous Gag polypeptide. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28.
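As an illustration of the "at least 90% identical" criterion recited above, the following minimal Python sketch computes percent identity from a pairwise alignment that has already been produced. It is an assumption-laden example and not the method of this disclosure: the function names, the toy sequences, and the convention used (identical columns divided by all aligned columns, with gaps counted as mismatches) are hypothetical, and other identity conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned (equal length, '-' marks
    a gap). Columns where both sequences carry a gap are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue sequences (hypothetical, not taken from the Sequence Listing).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"  # one substitution at the last position
print(percent_identity(reference, variant))
print(meets_identity_threshold(reference, variant))
```

In practice the alignment itself would come from an alignment tool, and the denominator (alignment length, shorter sequence, or reference length) should be stated explicitly because it changes the reported percentage.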

Disclosed herein, in certain embodiments, is a capsid comprising a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide, wherein the recombinant Arc polypeptide is not a rat Arc polypeptide or a human Arc polypeptide. In some embodiments, the capsid further comprises a cargo. In some embodiments, the cargo is a nucleic acid. In some embodiments, the cargo is an RNA. In some embodiments, the cargo is a therapeutic agent. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28.

Disclosed herein, in certain embodiments, is a vector comprising DNA encoding a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide. In some embodiments, the vector further encodes a therapeutic agent. In some embodiments, the therapeutic agent is a nucleic acid. In some embodiments, the nucleic acid is an RNA. In some embodiments, the recombinant Arc polypeptide is a human Arc polypeptide comprising an amino acid sequence that is SEQ ID NO: 1 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 1. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is a human endogenous Gag polypeptide. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28.

Disclosed herein, in certain embodiments, is a vector comprising DNA encoding a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide, wherein the recombinant Arc polypeptide is not a rat Arc polypeptide or a human Arc polypeptide. In some embodiments, the vector further encodes a cargo. In some embodiments, the cargo is a nucleic acid. In some embodiments, the cargo is an RNA. In some embodiments, the cargo is a therapeutic agent. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28.

Disclosed herein, in certain embodiments, is a method of delivering a cargo to a cell comprising administering to the cell a capsid comprising a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide and a therapeutic agent. In some embodiments, the therapeutic agent is a nucleic acid. In some embodiments, the nucleic acid is an RNA. In some embodiments, the recombinant Arc polypeptide is a human Arc polypeptide comprising an amino acid sequence that is SEQ ID NO: 1 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 1. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is a human endogenous Gag polypeptide. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28. In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a vertebrate cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cargo is a nucleic acid. In some embodiments, the cell expresses a gene encoded by the nucleic acid. In some embodiments, the cargo is a therapeutic agent.

Disclosed herein, in certain embodiments, is a method of delivering a cargo to a cell comprising administering to the cell a capsid comprising a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide, wherein the recombinant Arc polypeptide is not a rat Arc polypeptide or a human Arc polypeptide. In some embodiments, the capsid further comprises a cargo. In some embodiments, the cargo is a nucleic acid. In some embodiments, the cargo is an RNA. In some embodiments, the cargo is a therapeutic agent. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28. In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a vertebrate cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is a human cell. In some embodiments, the cargo is a nucleic acid. In some embodiments, the cell expresses a gene encoded by the nucleic acid. In some embodiments, the cargo is a therapeutic agent.

Disclosed herein, in certain embodiments, is a method of transfecting a nucleic acid into a cell comprising administering to the cell a capsid comprising a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide and a therapeutic agent. In some embodiments, the therapeutic agent is a nucleic acid. In some embodiments, the nucleic acid is an RNA. In some embodiments, the recombinant Arc polypeptide is a human Arc polypeptide comprising an amino acid sequence that is SEQ ID NO: 1 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 1. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is a human endogenous Gag polypeptide. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 16; b) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 17; c) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 18; d) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 19; e) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 20; f) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 21; g) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 22; h) an amino acid sequence that is SEQ ID NO: 23 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 23; i) an amino acid sequence that is SEQ ID NO: 24 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 24; j) an amino acid sequence that is SEQ ID NO: 25 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 25; k) an amino acid sequence that is SEQ ID NO: 26 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 26; l) an amino acid sequence that is SEQ ID NO: 27 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 27; or m) an amino acid sequence that is SEQ ID NO: 28 or an amino acid sequence that is at least 90% identical to the SEQ ID NO: 28.

Disclosed herein, in certain embodiments, is a method of transfecting a nucleic acid into a cell comprising administering to the cell a capsid comprising a recombinant Arc polypeptide or a recombinant endogenous Gag polypeptide, wherein the recombinant Arc polypeptide is not a rat Arc polypeptide or a human Arc polypeptide. In some embodiments, the capsid further comprises a cargo. In some embodiments, the cargo is a nucleic acid. In some embodiments, the cargo is an RNA. In some embodiments, the cargo is a therapeutic agent. In some embodiments, the recombinant Arc polypeptide is an Arc polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 2 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 2; b) an amino acid sequence that is SEQ ID NO: 3 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 3; c) an amino acid sequence that is SEQ ID NO: 4 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 4; d) an amino acid sequence that is SEQ ID NO: 5 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 5; e) an amino acid sequence that is SEQ ID NO: 6 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 6; f) an amino acid sequence that is SEQ ID NO: 7 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 7; g) an amino acid sequence that is SEQ ID NO: 8 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 8; h) an amino acid sequence that is SEQ ID NO: 9 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 9; i) an amino acid sequence that is SEQ ID NO: 10 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 10; j) an amino acid sequence that is SEQ ID NO: 11 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 11; k) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 12; l) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 13; m) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 14; or n) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 15. In some embodiments, the recombinant endogenous Gag polypeptide is an endogenous Gag polypeptide comprising: a) an amino acid sequence that is SEQ ID NO: 12 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 12; b) an amino acid sequence that is SEQ ID NO: 13 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 13; c) an amino acid sequence that is SEQ ID NO: 14 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 14; d) an amino acid sequence that is SEQ ID NO: 15 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 15; e) an amino acid sequence that is SEQ ID NO: 16 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 16; f) an amino acid sequence that is SEQ ID NO: 17 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 17; g) an amino acid sequence that is SEQ ID NO: 18 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 18; h) an amino acid sequence that is SEQ ID NO: 19 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 19; i) an amino acid sequence that is SEQ ID NO: 20 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 20; j) an amino acid sequence that is SEQ ID NO: 21 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 21; or k) an amino acid sequence that is SEQ ID NO: 22 or an amino acid sequence that is at least 90% identical to SEQ ID NO: 22.
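
The "at least 90% identical" criterion recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps. The function names, the toy sequences, and the choice of denominator (all aligned columns other than gap-gap columns) are assumptions made for illustration only; the passage above does not prescribe a particular alignment tool or identity formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns, skipping columns that are gaps in both."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy (hypothetical) aligned sequences, not taken from the Sequence Listing.
reference = "MELDHRTSGG"
candidate = "MELDHRTSGA"
print(percent_identity(reference, candidate))          # 9 of 10 columns match
print(meets_identity_threshold(reference, candidate))
```

Other identity conventions (for example, dividing by the length of the shorter sequence) would give different values for gapped alignments, so the denominator should be stated whenever a percentage is reported.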

Disclosed herein, in certain embodiments, is an engineered Arc or endo-Gag polypeptide comprising a cargo binding domain and at least one capsid forming subunit from an Arc or endo-Gag polypeptide. In some embodiments, the cargo binding domain comprises a nucleic acid binding domain. In some embodiments, the cargo binding domain comprises a polypeptide that binds to a small molecule. In some embodiments, the cargo binding domain comprises a polypeptide that binds to a protein, a peptide, or an antibody or binding fragment thereof. In some embodiments, the cargo binding domain comprises a polypeptide that binds to a peptidomimetic or a nucleotidomimetic. In some embodiments, the at least one capsid forming subunit comprises a polypeptide that corresponds to the CA N-lobe and/or CA C-lobe of SEQ ID NO: 1. In some embodiments, the engineered Arc or endo-Gag polypeptide further comprises a second capsid forming subunit from a different species of an Arc or endo-Gag polypeptide. In some embodiments, the second capsid forming subunit comprises a polypeptide that corresponds to the N-lobe and/or C-lobe of SEQ ID NO: 1. In some embodiments, the at least one capsid forming subunit and the second capsid forming subunit are each independently selected from a species of Arc or endo-Gag selected from a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant. In some embodiments, the at least one capsid forming subunit and the second capsid forming subunit are from two different species. In some embodiments, the cargo binding domain is fused either directly or via a linker to the C-terminus of the at least one capsid forming subunit. In some embodiments, the cargo binding domain is fused either directly or via a linker to the N-terminus of the at least one capsid forming subunit. In some embodiments, the second capsid forming subunit is fused either directly or via a linker to the C-terminus of the at least one capsid forming subunit. In some embodiments, the second capsid forming subunit is fused either directly or via a linker to the N-terminus of the at least one capsid forming subunit. In some embodiments, the cargo binding domain is fused either directly or via a linker to the N-terminus of the at least one capsid forming subunit and the second capsid forming subunit is fused either directly or via a linker to the C-terminus of the at least one capsid forming subunit. In some embodiments, the cargo binding domain is fused either directly or via a linker to the C-terminus of the at least one capsid forming subunit and the second capsid forming subunit is fused either directly or via a linker to the N-terminus of the at least one capsid forming subunit. In some embodiments, the engineered Arc or endo-Gag polypeptide further comprises a second polypeptide. In some embodiments, the second polypeptide is fused either directly or via a linker to the at least one capsid forming subunit. In some embodiments, the second polypeptide is fused either directly or via a linker to the cargo binding domain. In some embodiments, the second polypeptide is a protein or an antibody or a binding fragment thereof. In some embodiments, the protein is a human protein or a viral protein. In some embodiments, the protein is a human Gag-like protein. In some embodiments, the protein is a de novo engineered protein designed to bind to a target receptor of interest. In some embodiments, the second polypeptide guides the delivery of a capsid formed by the engineered Arc or endo-Gag polypeptide to a target site of interest.

Disclosed herein, in certain embodiments, is a truncated Arc or endo-Gag polypeptide wherein a portion that is not involved with capsid-formation, nucleic acid binding, or delivery is removed. In some embodiments, the portion comprises a matrix (MA) domain, a reverse transcriptase (RT) domain, a nucleotide binding domain, or a combination thereof, provided that the nucleotide binding domain is not a human Arc RNA binding domain. In some embodiments, the portion comprises a CA C-lobe domain. In some embodiments, the portion comprises an N-terminal deletion, a C-terminal deletion, or a combination thereof. In some embodiments, the N-terminal deletion comprises a deletion of up to 10 amino acids, 20 amino acids, 30 amino acids, or 50 amino acids. In some embodiments, the C-terminal deletion comprises a deletion of up to 10 amino acids, 20 amino acids, 30 amino acids, or 50 amino acids.

Disclosed herein, in certain embodiments, is an Arc or endo-Gag-based capsid comprising an engineered Arc or endo-Gag polypeptide which may be a truncated Arc or endo-Gag polypeptide and a cargo encapsulated by the capsid formed by the engineered Arc or endo-Gag polypeptide. In some embodiments, the cargo is a nucleic acid molecule. In some embodiments, the nucleic acid molecule is DNA, RNA, or a mixture of DNA and RNA. In some embodiments, the DNA and the RNA are each independently single-stranded, double-stranded, or a mixture of single and double stranded. In some embodiments, the cargo is a small molecule. In some embodiments, the cargo is a protein. In some embodiments, the cargo is a peptide. In some embodiments, the cargo is an antibody or binding fragment thereof. In some embodiments, the cargo is a peptidomimetic or a nucleotidomimetic. In some embodiments, the Arc or endo-Gag-based capsid comprises one or more additional capsid subunits from one or more species of Arc or endo-Gag proteins that are different than the engineered Arc or endo-Gag polypeptide. In some embodiments, the Arc-based or endo-Gag-based capsid comprises one or more additional capsid subunits from non-Arc proteins. In some embodiments, the one or more additional capsid subunits comprise Copia protein, ASPRV1 protein, a protein from the SCAN domain family, a protein encoded by the Paraneoplastic Ma antigen family (e.g. PNMA5, PNMA6, PNMA6A, and PNMA6B), a protein from the retrotransposon Gag-like family (e.g. RTL3, RTL6, RTL8A, RTL8B), or a combination thereof. In some embodiments, the one or more additional capsid subunits comprise BOP, LDOC1, MOAP1, PEG10, PNMA3, PNMA5, PNMA6A, PNMA6B, RTL3, RTL6, RTL8A, RTL8B, and ZNF18. In some embodiments, the capsid has a diameter of at least 1 nm, 2 nm, 3 nm, 4 nm, 5 nm, 10 nm, 15 nm, 20 nm, 25 nm, 30 nm, 50 nm, 80 nm, 100 nm, 120 nm, 150 nm, 200 nm, 250 nm, 300 nm, 500 nm, 600 nm, or more. In some embodiments, the capsid has a diameter of from about 1 nm to about 600 nm, from about 1 nm to about 500 nm, from about 1 nm to about 200 nm, from about 1 nm to about 100 nm, from about 1 nm to about 50 nm, or from about 1 nm to about 30 nm. In some embodiments, the capsid has a reduced off-target effect. In some embodiments, the capsid does not have an off-target effect. In some embodiments, the capsid is formed ex vivo. In some embodiments, the capsid is formed in vitro.

Disclosed herein, in certain embodiments, is a nucleic acid polymer encoding a recombinant or engineered Arc polypeptide or a recombinant or engineered endogenous Gag polypeptide described herein.

Disclosed herein, in certain embodiments, is a vector comprising a nucleic acid polymer encoding a recombinant or engineered Arc polypeptide or a recombinant or engineered endogenous Gag polypeptide described herein.

Disclosed herein, in certain embodiments, is a method of preparing a loaded Arc-based or endo-Gag-based capsid comprising: incubating a plurality of recombinant or engineered Arc polypeptides or a plurality of recombinant or engineered endo-Gag polypeptides with a cargo in a solution for a time sufficient to generate the loaded capsid. In some embodiments, the method further comprises mixing the solution comprising the plurality of engineered Arc or endo-Gag polypeptides with a plurality of non-Arc or non-endo-Gag capsid forming subunits prior to incubating with the cargo. In some embodiments, the plurality of non-Arc or non-endo-Gag capsid forming subunits are mixed with the plurality of recombinant or engineered Arc or endo-Gag polypeptides at a ratio of 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, or 10:1. In some embodiments, the plurality of non-Arc or non-endo-Gag capsid forming subunits are mixed with the plurality of engineered Arc or endo-Gag polypeptides at a ratio of 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, or 1:10. In some embodiments, the method further comprises mixing the solution comprising the plurality of truncated Arc or endo-Gag polypeptides with a plurality of non-Arc or non-endo-Gag capsid forming subunits prior to incubating with the cargo. In some embodiments, the plurality of non-Arc or non-endo-Gag capsid forming subunits are mixed with the plurality of truncated Arc or endo-Gag polypeptides at a ratio of 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, or 10:1. In some embodiments, the plurality of non-Arc or non-endo-Gag capsid forming subunits are mixed with the plurality of truncated Arc or endo-Gag polypeptides at a ratio of 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, or 1:10. In some embodiments, the plurality of engineered Arc or endo-Gag polypeptides is obtained from a bacterial cell system, an insect cell system, or a mammalian cell system. In some embodiments, the plurality of engineered Arc or endo-Gag polypeptides is obtained from a cell-free system. In some embodiments, the plurality of truncated Arc or endo-Gag polypeptides is obtained from a bacterial cell system, an insect cell system, or a mammalian cell system. In some embodiments, the plurality of truncated Arc or endo-Gag polypeptides is obtained from a cell-free system. In some embodiments, the loaded Arc-based or endo-Gag-based capsid is formulated for systemic administration. In some embodiments, the loaded Arc or endo-Gag-based capsid is formulated for local administration. In some embodiments, the loaded Arc or endo-Gag-based capsid is formulated for parenteral administration. In some embodiments, the loaded Arc or endo-Gag-based capsid is formulated for oral administration. In some embodiments, the loaded Arc or endo-Gag-based capsid is formulated for topical administration. In some embodiments, the loaded Arc or endo-Gag-based capsid is formulated for sublingual or aerosol administration.
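
The mixing ratios recited above can be turned into per-component amounts with a short calculation. The Python sketch below is illustrative only: it assumes the ratio is a molar ratio written as "non-Arc/non-endo-Gag subunits : Arc/endo-Gag polypeptides", and the function name, the picomole unit, and the example total are hypothetical choices, since the passage does not state whether the ratios are molar or by mass.

```python
from typing import Tuple


def split_by_ratio(total_pmol: float, ratio: str) -> Tuple[float, float]:
    """Split a total amount into two parts according to an 'a:b' ratio string.

    Returns (amount of first component, amount of second component).
    Assumption: the first number refers to the non-Arc or non-endo-Gag
    capsid forming subunits and the second to the Arc or endo-Gag polypeptides.
    """
    first, second = (int(part) for part in ratio.split(":"))
    if first <= 0 or second <= 0:
        raise ValueError("ratio parts must be positive integers")
    first_amount = total_pmol * first / (first + second)
    return first_amount, total_pmol - first_amount


# Hypothetical example: 1000 pmol of total subunits mixed at 3:1 and at 1:4.
print(split_by_ratio(1000, "3:1"))
print(split_by_ratio(1000, "1:4"))
```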

Disclosed herein, in certain embodiments, is use of an engineered or recombinant Arc-based or endo-Gag-based capsid for delivery of a cargo to a site of interest, comprising contacting a cell at the site of interest with an Arc-based or endo-Gag-based capsid for a time sufficient to facilitate cellular uptake of the capsid. In some embodiments, the cell is a tumor cell. In some embodiments, the tumor cell is a solid tumor cell. In some embodiments, the solid tumor cell is a cell from a bladder cancer, breast cancer, brain cancer, colorectal cancer, kidney cancer, liver cancer, lung cancer, pancreatic cancer, prostate cancer, skin cancer, stomach cancer, or thyroid cancer. In some embodiments, the tumor cell is from a hematologic malignancy. In some embodiments, the hematologic malignancy is a B-cell malignancy or a T-cell malignancy. In some embodiments, the hematologic malignancy is chronic lymphocytic leukemia (CLL), small lymphocytic lymphoma (SLL), diffuse large B cell lymphoma (DLBCL), follicular lymphoma, mantle cell lymphoma, Burkitt lymphoma, cutaneous T-cell lymphoma, or peripheral T cell lymphoma. In some embodiments, the cell is a somatic cell. In some embodiments, the cell is a stem cell or a progenitor cell. In some embodiments, the cell is a mesenchymal stem or progenitor cell. In some embodiments, the cell is a hematopoietic stem or progenitor cell. In some embodiments, the cell is a muscle cell, a skin cell, a blood cell, or an immune cell. In some embodiments, a target protein is overexpressed or is depleted in the cell. In some embodiments, a target gene in the cell has one or more mutations. In some embodiments, the cell comprises an impaired splicing mechanism. In some embodiments, the use is an in vivo use. In some embodiments, the Arc-based capsid is administered systemically to a subject. In some embodiments, the Arc-based or endo-Gag-based capsid is administered via local administration to a subject. In some embodiments, the Arc-based or endo-Gag-based capsid is administered parenterally to a subject. In some embodiments, the Arc-based capsid is administered orally to a subject. In some embodiments, the Arc-based or endo-Gag-based capsid is administered topically to a subject. In some embodiments, the Arc-based or endo-Gag-based capsid is administered via sublingual or aerosol administration to a subject. In some embodiments, the use is an in vitro or ex vivo use.

Disclosed herein, in certain embodiments, is a kit comprising an engineered Arc or endo-Gag polypeptide, a truncated Arc or endo-Gag polypeptide, a vector encoding a recombinant or engineered Arc or endo-Gag polypeptide, or an Arc-based or endo-Gag-based capsid.

BRIEF DESCRIPTION OF THE DRAWINGS

Various aspects of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings below.

FIG. 1 is a representation of exemplary Arc polypeptides.

FIG. 2 is a representation of exemplary engineered Arc polypeptides.

FIGS. 3A and 3B illustrate an exemplary method of engineering an Arc polypeptide to carry a specific cargo (e.g., an RNA payload) (FIG. 3A) or to remove an off-function effect (FIG. 3B).

FIG. 4A shows the isolation of 6×His-tagged human Arc by elution from a HisTrap column with an imidazole gradient.

FIG. 4B shows the separation of 6×His-tagged human Arc from residual nucleic acids on a mono Q column eluted with a NaCl gradient.

FIG. 5 shows a transmission electron microscope image of negatively stained human Arc capsids.

FIG. 6 shows transmission electron microscope images of negatively stained capsids formed from recombinantly expressed Arc orthologs.

FIG. 7 shows transmission electron microscope images of negatively stained capsids formed from recombinantly expressed endo-Gag proteins.

FIG. 8 shows selective internalization of Alexa594-labeled Arc capsids by HeLa cells.

FIG. 9 shows the delivery of Cre RNA to HeLa cells by Arc capsids.

FIG. 10 illustrates methods for screening Arc and endo-Gag gene candidates for the ability to transmit a heterologous RNA payload.

DETAILED DESCRIPTION OF THE DISCLOSURE

Administering diagnostic or therapeutic agents to a site of interest with precision has presented an ongoing challenge. Available methods of delivering nucleic acids to cells have myriad limitations. For example, AAV viral vectors often used for gene therapy are immunogenic, have a limited payload capacity of <3 kb, suffer from poor bio-distribution, can only be administered by direct injection, and pose a risk of disrupting host genes by integration. Non-viral methods have different limitations. Liposomes are primarily delivered to the liver. Extracellular vesicles have a limited payload capacity of <1 kb, limited scalability, and purification difficulties. Thus, there is a recognized need for new methods of delivering therapeutic payloads.

Most molecules do not possess inherent affinity for a particular site in the body. In other cases, the administered agents accumulate either in the liver and the kidney for clearance or in unintended tissue or cell types. Methods for improving delivery include coating the agent of choice with hydrophobic compounds or polymers. Such an approach increases the duration of said agent in circulation and augments hydrophobicity for cellular uptake. On the other hand, this approach does not actively direct cargo to the site of interest for delivery.

To specifically target sites where therapy is needed, therapeutic compounds are optionally fused to moieties such as ligands, antibodies, and aptamers that recognize and bind to receptors displayed on the surface of targeted cells. Upon reaching a cell of interest, the therapeutic compound is optionally further delivered to an intracellular target. For example, a therapeutic RNA can be translated to a protein if it comes into contact with a ribosome in the cytoplasm of the cell.

Arc (activity-regulated cytoskeleton-associated protein) regulates the endocytic trafficking of α-amino-3-hydroxy-5-methylisoxazole-4-propionic acid (AMPA) type glutamate receptors. Arc activities have been linked to synaptic strength and neuronal plasticity. Phenotypes of loss of Arc in experimental murine models include defective formation of long-term memory and reduced neuronal activity and plasticity.

Arc exhibits similar molecular properties to retroviral Gag proteins. The Arc gene may have originated from the Ty3/gypsy retrotransposon. An endogenous Gag (endo-Gag) protein is any protein endogenous to a eukaryotic organism, including Arc, that has predicted and annotated similarity to viral Gag proteins. Exemplary endo-Gag proteins are disclosed in Campillos M, Doerks T, Shah P K, and Bork P, Computational characterization of multiple Gag-like human proteins, Trends Genet. 2006 November; 22(11):585-9. An endo-Gag protein is optionally recombinantly expressed by any host cell, including a prokaryotic or eukaryotic cell, or a bacterial, yeast, insect, vertebrate, mammalian, or human cell. As described herein, in some embodiments an endo-Gag protein assembles into an endo-Gag capsid.

Disclosed herein, in certain embodiments, are Arc and endo-Gag polypeptides which assemble into a capsid for delivery of a cargo of interest. In some embodiments, also described herein are engineered Arc and endo-Gag polypeptides which assemble into a capsid for delivery of a cargo of interest. In additional embodiments, described herein are capsids, e.g., Arc-based or endo-Gag-based capsids, for delivery of a cargo of interest.

Arc Polypeptides and Endogenous Gag Polypeptides

In certain embodiments, disclosed herein is an Arc polypeptide. In certain embodiments, disclosed herein is an endo-Gag polypeptide. It should be understood that endo-Gag sequences are optional substitutes for Arc sequences to form any type of engineered Arc polypeptide described in this section.

In some instances, Arc is a non-human Arc polypeptide. In some instances, the Arc polypeptide comprises a full-length Arc polypeptide (e.g., a full-length non-human Arc polypeptide). In other instances, the Arc polypeptide comprises a fragment of non-human Arc, such as a truncated Arc polypeptide, that participates in the formation of a capsid. In additional instances, the Arc polypeptide comprises one or more domains of a non-human Arc polypeptide, in which at least one of the domains participates in the formation of a capsid. In further instances, the Arc polypeptide is a recombinant Arc polypeptide.

In some instances, endo-Gag is a non-human endo-Gag polypeptide. In some instances, the endo-Gag polypeptide comprises a full-length endo-Gag polypeptide (e.g., a full-length non-human endo-Gag polypeptide). In other instances, the endo-Gag polypeptide comprises a fragment of non-human endo-Gag, such as a truncated endo-Gag polypeptide, that participates in the formation of a capsid. In additional instances, the endo-Gag polypeptide comprises one or more domains of a non-human endo-Gag polypeptide, in which at least one of the domains participates in the formation of a capsid. In further instances, the endo-Gag polypeptide is a recombinant endo-Gag polypeptide.

In some embodiments, the Arc is a human Arc polypeptide with at least its RNA binding domain modified to bind to a cargo that is not native to the human Arc. In some instances, the Arc polypeptide comprises a full-length human Arc polypeptide with at least its RNA binding domain modified to bind to a cargo that is not native to the human Arc protein. In other instances, the Arc polypeptide comprises a human Arc fragment comprising modification(s) in at least its RNA binding domain. In additional instances, the Arc polypeptide comprises one or more domains of a human Arc polypeptide, in which at least one of the domains participates in the formation of a capsid and in which the RNA binding domain is modified to bind to a cargo that the native human Arc protein does not bind. In further instances, the Arc polypeptide is a recombinant human Arc polypeptide in which at least the RNA binding domain is modified to enable loading of a cargo that is not native to the human Arc protein.

In some embodiments, the endo-Gag is a human endo-Gag polypeptide with at least its RNA binding domain modified to bind to a cargo that is not native to the human endo-Gag. In some instances, the endo-Gag polypeptide comprises a full-length human endo-Gag polypeptide with at least its RNA binding domain modified to bind to a cargo that is not native to the human endo-Gag protein. In other instances, the endo-Gag polypeptide comprises a human endo-Gag fragment comprising modification(s) in at least its RNA binding domain to bind to a cargo that a native human endo-Gag protein does not bind. In additional instances, the endo-Gag polypeptide comprises one or more domains of a human endo-Gag polypeptide, in which at least one of the domains participates in the formation of a capsid and in which the RNA binding domain is modified to bind to a cargo that is not native to the human endo-Gag protein. In further instances, the endo-Gag polypeptide is a recombinant human endo-Gag polypeptide in which at least the RNA binding domain is modified to enable loading of a cargo that is not native to the human endo-Gag protein.

In some instances, the Arc or endo-Gag polypeptide is an engineered Arc or endo-Gag polypeptide. As used herein, an engineered polypeptide is a recombinant polypeptide that is not identical in sequence to a full-length, wild-type polypeptide. In some instances, the engineered Arc or endo-Gag polypeptide comprises a fragment of an Arc or endo-Gag polypeptide from a first species and at least an additional fragment from an Arc or endo-Gag polypeptide of a second species. In some cases, the first Arc or endo-Gag polypeptide is selected from a kingdom member of animalia, plantae, fungi, or protista. In some cases, the first species is selected from a mammal, a rodent, a bird, a reptile, a fish, a vertebrate, a eukaryote, an insect, a fungus, or a plant. In some cases, the second Arc or endo-Gag polypeptide is selected from a kingdom member of animalia, plantae, fungi, or protista that is the same as or different from that of the first Arc or endo-Gag polypeptide. In some cases, the second species is selected from a mammal, a rodent, a bird, a reptile, a fish, a vertebrate, a eukaryote, an insect, a fungus, or a plant that is different from the first species.

In some embodiments, an exemplary mammalian Arc or endo-Gag protein for expression as a recombinant or engineered Arc polypeptide is from the species Homo sapiens. Additional exemplary species of primate Arc or endo-Gag proteins for expression as a recombinant or engineered Arc polypeptide include: gorilla, Pongo abelii, Pan paniscus, Macaca nemestrina, Chlorocebus sabaeus, Papio anubis, Rhinopithecus roxellana, Macaca fascicularis, Nomascus leucogenys, Callithrix jacchus, Aotus nancymaae, Cebus capucinus imitator, Saimiri boliviensis boliviensis, Otolemur garnettii, and Macaca mulatta.

An exemplary species list of rodent Arc or endo-Gag proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Fukomys damarensis, Microcebus murinus, Heterocephalus glaber, Propithecus coquereli, Marmota marmota marmota, Galeopterus variegatus, Cavia porcellus, Dipodomys ordii, Octodon degus, Castor canadensis, Nannospalax galili, Carlito syrichta, Chinchilla lanigera, Mus musculus, Ictidomys tridecemlineatus, Rattus norvegicus, Microtus ochrogaster, Otolemur garnettii, Meriones unguiculatus, Cricetulus griseus, Neotoma lepida, Jaculus jaculus, Mustela putorius furo, Mesocricetus auratus, Tupaia chinensis, Chrysochloris asiatica, Elephantulus edwardii, Erinaceus europaeus, Ochotona princeps, Sorex araneus, Monodelphis domestica, Echinops telfairi, and Condylura cristata.

An exemplary species list of Arc or endo-Gag proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Vulpes vulpes, Canis lupus dingo, Felis catus, Panthera pardus, Callorhinus ursinus, Odobenus rosmarus divergens, Equus asinus, Sus scrofa, Manis javanica, Ceratotherium simum simum, Leptonychotes weddellii, Enhydra lutris kenyoni, Lipotes vexillifer, Bos grunniens, Bubalus bubalis, Camelus dromedarius, Vicugna pacos, Orcinus orca, Neomonachus schauinslandi, Tursiops truncatus, Bos taurus, Capra hircus, Delphinapterus leucas, Ovis aries musimon, Balaenoptera acutorostrata scammoni, Neophocaena asiaeorientalis asiaeorientalis, Miniopterus natalensis, Pteropus alecto, Physeter catodon, Loxodonta africana, Orycteropus afer afer, Bos mutus, Desmodus rotundus, Hipposideros armiger, Ailuropoda melanoleuca, Trichechus manatus latirostris, Rousettus latirostris, Rousettus aegyptiacus, Eptesicus fuscus, Rhinolophus sinicus, Cervus elaphus hippelaphus, Odocoileus virginianus texanus, Pantholops hodgsonii, Camelus bactrianus, Sarcophilus harrisii, Phascolarctos cinereus, and Ornithorhynchus anatinus.

An exemplary species list of bird Arc or endo-Gag proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Gallus gallus, Corvus cornix cornix, Parus major, Corvus brachyrhynchos, Dromaius novaehollandiae, and Apteryx rowi.

An exemplary species list of reptile Arc protein for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Python bivittatus, Pogona vitticeps, Anolis carolinensis, Protobothrops mucrosquamatus, Alligator sinensis, Crocodylus porosus, Gavialis gangeticus, Alligator mississippiensis, Pelodiscus sinensis, Terrapene mexicana triunguis, Chrysemys picta bellii, Chelonia mydas, Nanorana parkeri, Xenopus tropicalis, Xenopus laevis, and Latimeria chalumnae.

An exemplary species list of fish Arc protein for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Oncorhynchus mykiss, Acanthochromis polyacanthus, Oncorhynchus kisutch, Carassius auratus, and Austrofundulus limnaeus.

An exemplary species list of insect Arc or endo-Gag proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Drosophila serrata, Drosophila bipectinata, Solenopsis invicta, Temnothorax curvispinosus, Drosophila melanogaster, Agrilus planipennis, Camponotus floridanus, Pogonomyrmex barbatus, Nilaparvata lugens, Bombyx mori, Tribolium castaneum, and Leptinotarsa decemlineata.

An exemplary species list of plant Arc or endo-Gag proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Spinacia oleracea and Erythranthe guttata.

An exemplary species list of fungi proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Saccharomyces cerevisiae, Rhizopus delemar, Fusarium oxysporum, Cryptococcus neoformans, Rhizophagus irregularis, Fusarium fujikuroi, Candida albicans, Trichophyton rubrum, Pyrenophora tritici-repentis, Rhizopus microsporus, Rhizoctonia solani, Aspergillus flavus, Verticillium dahliae, Fusarium verticillioides, Aspergillus niger, Fusarium graminearum, Aspergillus fumigatus, Zymoseptoria tritici, and Trichoderma harzianum.

An exemplary species list of protist Arc or endo-Gag proteins for expression as a recombinant or engineered Arc or endo-Gag polypeptide includes: Entamoeba histolytica, Paulinella micropora, Guillardia theta, Oxyrrhis marina, Seminavis robusta, Euglena longa, Naegleria gruberi, and Trichomonas vaginalis.

In some instances, Arc or endo-Gag comprises a capsid assembly/forming (CA) domain, a cargo binding domain (e.g., an RNA binding domain), and optionally a matrix (MA) domain, a reverse transcriptase (RT) domain, or a combination thereof. In some cases, the CA domain is further divided into an N-lobe domain and a C-lobe domain. In some cases, the cargo binding domain comprises an RNA binding domain, a DNA binding domain, a protein binding domain, a peptide binding domain, an antibody binding domain, a small molecule binding domain, or a peptidomimetic/nucleotidomimetic binding domain. Exemplary cargo binding domains include, but are not limited to, domains from GPCRs, antibodies or binding fragments thereof, lipoproteins, integrins, tyrosine kinases, DNA-binding proteins, RNA-binding proteins, nucleases, ligases, proteases, integrases, isomerases, phosphatases, GTPases, aromatases, esterases, adaptor proteins, G-proteins, GEFs, cytokines, interleukins, interleukin receptors, interferons, interferon receptors, caspases, transcription factors, neurotrophic factors and their receptors, growth factors and their receptors, signal recognition particle and receptor components, extracellular matrix proteins, integral components of membrane, ribosomal proteins, translation elongation factors, translation initiation factors, GPI-anchored proteins, tissue factors, dystrophin, utrophin, dystrobrevin, and any fusions, combinations, subunits, derivatives, or domains thereof.

In some embodiments, one or more non-essential regions which are not involved in capsid formation or nucleic acid binding are removed from an Arc or endo-Gag protein to generate an Arc or endo-Gag polypeptide. In such instances, one or more non-essential regions, e.g., an N-terminal region (e.g., up to 10 amino acids, up to 20 amino acids, up to 30 amino acids, or up to 50 amino acids), a C-terminal region (e.g., up to 10 amino acids, up to 20 amino acids, up to 30 amino acids, or up to 50 amino acids), an RT domain, an MA domain, or a combination thereof, are deleted from an Arc or endo-Gag protein to generate an Arc or endo-Gag polypeptide. In some cases, only the essential regions involved in capsid assembly/forming and cargo binding remain in an Arc or endo-Gag polypeptide. In additional cases, only the essential region involved in capsid assembly/forming (e.g., the N-lobe and/or the C-lobe) remains in an Arc polypeptide.
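
The N-terminal and C-terminal deletions described above amount to slicing residues from either end of a sequence. The following Python sketch is a hedged illustration: the function name, the 50-residue cap taken from the largest value recited above, and the placeholder sequence are assumptions, and the sketch says nothing about whether a given truncation preserves capsid formation.

```python
MAX_TERMINAL_DELETION = 50  # largest deletion size recited in the text above


def truncate_termini(sequence: str, n_del: int = 0, c_del: int = 0) -> str:
    """Return `sequence` with n_del residues removed from the N-terminus
    and c_del residues removed from the C-terminus."""
    if n_del < 0 or c_del < 0:
        raise ValueError("deletion sizes must be non-negative")
    if n_del > MAX_TERMINAL_DELETION or c_del > MAX_TERMINAL_DELETION:
        raise ValueError("deletion exceeds the illustrated 50-residue limit")
    if n_del + c_del >= len(sequence):
        raise ValueError("deletions would remove the entire sequence")
    # Compute the end index explicitly so that c_del == 0 keeps the full C-terminus.
    end = len(sequence) - c_del
    return sequence[n_del:end]


# Placeholder 100-residue sequence (hypothetical, not from the Sequence Listing).
full_length = "M" + "A" * 99
truncated = truncate_termini(full_length, n_del=10, c_del=20)
print(len(truncated))
```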

In certain embodiments, the RT domain, the MA domain, and/or the endogenous RNA binding domain are replaced with other cargo binding domains: for example, replaced with a DNA binding domain, a protein binding domain, a peptide binding domain, an antibody binding domain, a small molecule binding domain, a peptidomimetic binding domain, or a nucleotidomimetic binding domain. In some embodiments, an Arc or endo-Gag polypeptide comprises truncations or modifications of domains involved in capsid forming, nucleic acid binding, or delivery.

In some embodiments, the Arc or endo-Gag polypeptide comprises a MA domain, a CA N-lobe, a CA C-lobe, a cargo binding domain, and a RT domain. In some instances, the Arc polypeptide comprises from N-terminus to C-terminus the following domains: the MA domain, the CA N-lobe, the CA C-lobe, the RT domain, and the cargo binding domain. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the MA domain, the RT domain, the cargo binding domain, the CA N-lobe, and the CA C-lobe. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the cargo binding domain, the MA domain, the RT domain, the CA N-lobe, and the CA C-lobe. In some instances, the domains are arranged in an order that does not impede capsid assembly and cargo binding. In some instances, each of the domains is either directly or indirectly fused to the respective two flanking domains.

In some embodiments, the Arc or endo-Gag polypeptide comprises a MA domain, a CA N-lobe, a CA C-lobe, and a cargo binding domain. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the MA domain, the CA N-lobe, the CA C-lobe, and the cargo binding domain. In some instances, the Arc polypeptide comprises from N-terminus to C-terminus the following domains: the MA domain, the cargo binding domain, the CA N-lobe, and the CA C-lobe. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the cargo binding domain, the MA domain, the CA N-lobe, and the CA C-lobe. In some instances, the domains are arranged in an order that does not impede capsid assembly and cargo binding. In some instances, each of the domains is either directly or indirectly fused to the respective two flanking domains.

In some embodiments, the Arc or endo-Gag polypeptide comprises a CA N-lobe, a CA C-lobe, and a cargo binding domain. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the CA N-lobe, the CA C-lobe, and the cargo binding domain. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the cargo binding domain, the CA N-lobe, and the CA C-lobe. In some instances, the domains are arranged in an order that does not impede capsid assembly and cargo binding. In some instances, each of the domains is either directly or indirectly fused to the respective two flanking domains.

In some embodiments, the Arc or endo-Gag polypeptide comprises a CA N-lobe and a cargo binding domain. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the CA N-lobe and the cargo binding domain. In some instances, the Arc or endo-Gag polypeptide comprises from N-terminus to C-terminus the following domains: the cargo binding domain and the CA N-lobe. In some instances, the domains are arranged in an order that does not impede capsid assembly and cargo binding. In some instances, the two domains are either directly or indirectly fused to each other.

In some embodiments, the Arc or endo-Gag polypeptide is engineered to comprise a cargo binding domain, a CA domain, a MA domain, or a RT domain from one or more additional species to generate an engineered Arc polypeptide. For example, the engineered Arc or endo-Gag polypeptide comprises a cargo binding domain, a CA domain, a MA domain, or a RT domain from a first species and a cargo binding domain, a CA domain, a MA domain, or a RT domain from a second species. In some cases, the first species is selected from a eukaryote, a vertebrate, a human, a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant. In some cases, the second species is selected from a eukaryote, a vertebrate, a human, a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant that is different from the first species.

In some instances, the engineered Arc or endo-Gag polypeptide comprises a cargo binding domain from a first species and a CA domain (e.g., a CA N-lobe and optionally a CA C-lobe) from a second species. The engineered Arc or endo-Gag polypeptide optionally comprises a MA domain and an RT domain from either the first species or the second species. In some cases, the first species is selected from a eukaryote, a vertebrate, a human, a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant. In some cases, the second species is selected from a eukaryote, a vertebrate, a human, a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant that is different from the first species.

In some instances, the engineered Arc or endo-Gag polypeptide comprises a cargo binding domain, a first CA domain, a second CA domain, and optionally a MA domain and/or a RT domain. In some cases, the cargo binding domain, the first CA domain, and optionally a MA domain and/or a RT domain are from a first species and the second CA domain is from a second species. In some cases, the first CA domain is from a first species and the cargo binding domain, the second CA domain, and optionally a MA domain and/or a RT domain are from a second species. In some instances, the domains are arranged in an order that does not impede capsid assembly and cargo binding. In some instances, each of the domains is either directly or indirectly fused to the respective two adjacent domains.

In some instances, the engineered Arc or endo-Gag polypeptide comprises a cargo binding domain, a first CA domain, and a second CA domain. In some cases, the cargo binding domain and the first CA domain are from a first species and the second CA domain is from a second species. In some cases, the first CA domain is from a first species and the cargo binding domain and the second CA domain are from a second species. In such cases, the engineered Arc or endo-Gag polypeptide comprises from the N-terminus to the C-terminus the following domains: a cargo binding domain, a first CA domain, and a second CA domain. In such cases, the engineered Arc or endo-Gag polypeptide comprises from the N-terminus to the C-terminus the following domains: a first CA domain, a cargo binding domain, and a second CA domain. In such cases, the engineered Arc or endo-Gag polypeptide comprises from the N-terminus to the C-terminus the following domains: a first CA domain, a second CA domain, and a cargo binding domain. In some instances, the domains are arranged in an order that does not impede capsid assembly and cargo binding. In some instances, each of the domains is either directly or indirectly fused to the respective two flanking domains.

In some instances, the engineered Arc or endo-Gag polypeptide further comprises a second polypeptide. In some instances, the second polypeptide is fused directly or indirectly via a linker to one or more of: a cargo binding domain, a first CA domain, a second CA domain, a MA domain if present, or a RT domain if present. In some cases, the second polypeptide is a protein (e.g., a human protein), an antibody or binding fragment thereof, a viral protein, a Gag-like protein (e.g., a human Gag-like protein), or a de novo engineered protein designed to bind to a target receptor of interest. In some instances, the antibody or binding fragment thereof comprises a humanized antibody or binding fragment thereof, a murine antibody or binding fragment thereof, a chimeric antibody or binding fragment thereof, a monoclonal antibody or binding fragment thereof, a multi-specific antibody or binding fragment thereof, a bispecific antibody or binding fragment thereof, a monovalent Fab′, a divalent Fab₂, F(ab)′₃ fragments, a single-chain variable fragment (scFv), a bis-scFv, an (scFv)₂, a diabody, a minibody, a nanobody, a triabody, a tetrabody, a disulfide stabilized Fv protein (dsFv), a single-domain antibody (sdAb), an Ig NAR, a camelid antibody or binding fragment thereof, or a chemically modified derivative thereof. In some instances, the second polypeptide guides the delivery of a capsid formed by the engineered Arc polypeptide to a target site of interest.

In some embodiments, a nucleic acid sequence or amino acid sequence of the disclosure (for example, encoding an Arc polypeptide or endo-Gag polypeptide) has at least 70% homology, at least 71% homology, at least 72% homology, at least 73% homology, at least 74% homology, at least 75% homology, at least 76% homology, at least 77% homology, at least 78% homology, at least 79% homology, at least 80% homology, at least 81% homology, at least 82% homology, at least 83% homology, at least 84% homology, at least 85% homology, at least 86% homology, at least 87% homology, at least 88% homology, at least 89% homology, at least 90% homology, at least 91% homology, at least 92% homology, at least 93% homology, at least 94% homology, at least 95% homology, at least 96% homology, at least 97% homology, at least 98% homology, at least 99% homology, at least 99.1% homology, at least 99.2% homology, at least 99.3% homology, at least 99.4% homology, at least 99.5% homology, at least 99.6% homology, at least 99.7% homology, at least 99.8% homology, at least 99.9% homology, or at least 99.99% homology to an amino acid sequence provided herein. Various methods and software programs are used to determine the homology between two or more sequences, such as NCBI BLAST, Clustal W, MAFFT, Clustal Omega, AlignMe, Praline, or another suitable method or algorithm.
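By way of non-limiting illustration, once two sequences have been aligned by a program such as those named above, a percent identity value might be computed along the following lines. This is a minimal sketch only: the function names, the "-" gap symbol, and the choice of denominator (all aligned columns other than double-gap columns) are assumptions made for illustration, and alignment programs differ in how they define the denominator. The sketch computes identity over a pre-made alignment and does not itself perform an alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-aligned sequence pair.

    Assumption: both strings come from the same pairwise alignment and
    therefore have equal length. Columns where both rows are gaps are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are a gap in both rows.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

A candidate sequence could then be screened against a reference with, for example, `meets_threshold(candidate_row, reference_row, 95.0)`, where both arguments are rows of the same alignment.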

In certain embodiments, the Arc polypeptide is a human polypeptide having the amino acid sequence of SEQ ID NO: 1 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1.

In certain embodiments, the Arc polypeptide is a killer whale polypeptide having the amino acid sequence of SEQ ID NO: 2 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 2.

In certain embodiments, the Arc polypeptide is a white-tailed deer polypeptide having the amino acid sequence of SEQ ID NO: 3 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 3.

In certain embodiments, the Arc polypeptide is a platypus polypeptide having the amino acid sequence of SEQ ID NO: 4 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 4.

In certain embodiments, the Arc polypeptide is a goose polypeptide having the amino acid sequence of SEQ ID NO: 5 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5.

In certain embodiments, the Arc polypeptide is a Dalmatian pelican polypeptide having the amino acid sequence of SEQ ID NO: 6 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 6.

In certain embodiments, the Arc polypeptide is a white-tailed eagle polypeptide having the amino acid sequence of SEQ ID NO: 7 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 7.

In certain embodiments, the Arc polypeptide is a king cobra polypeptide having the amino acid sequence of SEQ ID NO: 8 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 8.

In certain embodiments, the Arc polypeptide is a ray-finned fish polypeptide having the amino acid sequence of SEQ ID NO: 9 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 9.

In certain embodiments, the Arc polypeptide is a sperm whale polypeptide having the amino acid sequence of SEQ ID NO: 10 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 10.

In certain embodiments, the Arc polypeptide is a turkey polypeptide having the amino acid sequence of SEQ ID NO: 11 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 11.

In certain embodiments, the Arc polypeptide is a central bearded dragon polypeptide having the amino acid sequence of SEQ ID NO: 12 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 12.

In certain embodiments, the Arc polypeptide is a Chinese alligator polypeptide having the amino acid sequence of SEQ ID NO: 13 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 13.

In certain embodiments, the Arc polypeptide is an American alligator polypeptide having the amino acid sequence of SEQ ID NO: 14 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 14.

In certain embodiments, the Arc polypeptide is a Japanese gecko polypeptide having the amino acid sequence of SEQ ID NO: 15 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 15.

In certain embodiments, the endo-Gag polypeptide is a human PNMA3 polypeptide having the amino acid sequence of SEQ ID NO: 16 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 16.

In certain embodiments, the endo-Gag polypeptide is a human PNMA5 polypeptide having the amino acid sequence of SEQ ID NO: 17 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 17.

In certain embodiments, the endo-Gag polypeptide is a human PNMA6A polypeptide having the amino acid sequence of SEQ ID NO: 18 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 18.

In certain embodiments, the endo-Gag polypeptide is a human PNMA6B polypeptide having the amino acid sequence of SEQ ID NO: 19 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 19.

In certain embodiments, the endo-Gag polypeptide is a human RTL3 polypeptide having the amino acid sequence of SEQ ID NO: 20 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 20.

In certain embodiments, the endo-Gag polypeptide is a human RTL6 polypeptide having the amino acid sequence of SEQ ID NO: 21 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 21.

In certain embodiments, the endo-Gag polypeptide is a human RTL8A polypeptide having the amino acid sequence of SEQ ID NO: 22 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 22.

In certain embodiments, the endo-Gag polypeptide is a human RTL8B polypeptide having the amino acid sequence of SEQ ID NO: 23 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 23.

In certain embodiments, the endo-Gag polypeptide is a human BOP polypeptide having the amino acid sequence of SEQ ID NO: 24 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 24.

In certain embodiments, the endo-Gag polypeptide is a human LDOC1 polypeptide having the amino acid sequence of SEQ ID NO: 25 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 25.

In certain embodiments, the endo-Gag polypeptide is a human ZNF18 polypeptide having the amino acid sequence of SEQ ID NO: 26 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 26.

In certain embodiments, the endo-Gag polypeptide is a human MOAP1 polypeptide having the amino acid sequence of SEQ ID NO: 27 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 27.

In certain embodiments, the endo-Gag polypeptide is a human PEG10 polypeptide having the amino acid sequence of SEQ ID NO: 28 or a sequence having at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 28.

In some cases, the recombinant Arc or endo-Gag polypeptide is an Arc polypeptide illustrated in FIG. 1.

In some cases, the engineered Arc or endo-Gag polypeptide is an engineered Arc polypeptide illustrated in FIG. 2.

Linkers

In certain embodiments, a polypeptide of the disclosure comprises a linker. In some embodiments, the linker is a peptide linker. In some instances, the linker is a rigid linker. In other instances, the linker is a flexible linker. In some cases, the linker is a non-cleavable linker. In other cases, the linker is a cleavable linker. In additional cases, the linker comprises a linear structure, or a non-linear structure (e.g., a cyclic structure).

In certain embodiments, non-cleavable linkers comprise short peptides of varying lengths. Exemplary non-cleavable linkers include (EAAAK)n (SEQ ID NO: 70), or (EAAAR)n (SEQ ID NO: 71), where n is from 1 to 5, and up to 30 residues of glutamic acid-proline or lysine-proline repeats. In some embodiments, the non-cleavable linker comprises (GGGGS)n (SEQ ID NO: 72) or (GGGS)n (SEQ ID NO: 73), wherein n is 1 to 10; KESGSVSSEQLAQFRSLD (SEQ ID NO: 74); or EGKSSGSGSESKST (SEQ ID NO: 75). In some embodiments, the non-cleavable linker comprises a poly-Gly/Ala polymer.
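By way of non-limiting illustration, the repeat notation used above, such as (GGGGS)n or (EAAAK)n, can be expanded into an explicit amino acid string as sketched below. The function name and its bounds-checking arguments are hypothetical and are not part of the disclosure. The default bound of 10 mirrors the range recited for the (GGGGS)n and (GGGS)n linkers, and a bound of 5 would be passed for the (EAAAK)n and (EAAAR)n linkers.

```python
def repeat_linker(unit: str, n: int, min_n: int = 1, max_n: int = 10) -> str:
    """Expand a (unit)n linker into its one-letter amino acid sequence.

    `unit` is the repeating peptide (e.g. "GGGGS"); `n` is the repeat count.
    `min_n` and `max_n` are illustrative bounds, not values fixed by this code.
    """
    if not unit:
        raise ValueError("repeat unit must be a non-empty string")
    if not (min_n <= n <= max_n):
        raise ValueError(f"n must be between {min_n} and {max_n}, got {n}")
    return unit * n
```

For example, `repeat_linker("GGGGS", 3)` yields a 15-residue flexible linker, and `repeat_linker("EAAAK", 2, max_n=5)` yields a 10-residue linker.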

In certain embodiments, the linker is a cleavable linker, e.g., an extracellular cleavable linker or an intracellular cleavable linker. In some instances, the linker is designed for cleavage in the presence of particular conditions or in a particular environment (e.g., under physiological conditions). For example, the design of a linker for cleavage by specific conditions, such as by a specific enzyme, allows the targeting of cellular uptake to a specific location.

In some embodiments, the linker is a pH-sensitive linker. In one instance, the linker is cleaved under basic pH conditions. In another instance, the linker is cleaved under acidic pH conditions.

In some embodiments, the linker is cleaved in vivo by endogenous enzymes (e.g., proteases) such as serine proteases including but not limited to thrombin, metalloproteases, furin, cathepsin B, necrotic enzymes (e.g., calpains), and the like. Exemplary cleavable linkers include, but are not limited to, GGAANLVRGG (SEQ ID NO: 76); SGRIGFLRTA (SEQ ID NO: 77); SGRSA (SEQ ID NO: 78); GFLG (SEQ ID NO: 79); ALAL (SEQ ID NO: 80); FK; PIC(Et)F-F (SEQ ID NO: 81), where C(Et) indicates S-ethylcysteine; PR(S/T)(L/I)(S/T) (SEQ ID NO: 82); DEVD (SEQ ID NO: 83); GWEHDG (SEQ ID NO: 84); RPLALWRS (SEQ ID NO: 85); or a combination thereof.
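By way of non-limiting illustration, a degenerate linker motif written with alternatives in parentheses, such as PR(S/T)(L/I)(S/T), can be converted to a regular expression and used to locate occurrences of the motif in a polypeptide sequence, as sketched below. The function names and the example sequence in the usage note are hypothetical. The sketch only finds where the motif occurs in a string; it does not predict whether a protease would cleave at that position, and it does not handle modified residues such as C(Et).

```python
import re


def motif_to_regex(motif: str) -> str:
    """Rewrite '(S/T)'-style alternatives as regex character classes.

    Example: 'PR(S/T)(L/I)(S/T)' becomes 'PR[ST][LI][ST]'.
    """
    return re.sub(
        r"\(([A-Z](?:/[A-Z])*)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )


def find_motif_sites(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    pattern = re.compile(motif_to_regex(motif))
    return [m.start() for m in pattern.finditer(sequence)]
```

For example, `find_motif_sites("AAPRSLTGGPRTISAA", "PR(S/T)(L/I)(S/T)")` reports the two positions at which the motif begins in that made-up sequence.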

Capsids

In some embodiments, disclosed herein is a capsid. In some instances, the capsid comprises an Arc polypeptide and/or an endo-Gag polypeptide such as a Copia protein, ASPRV1 protein, a protein from the SCAN domain family, a protein encoded by the Paraneoplastic Ma antigen family, a protein or a combination of proteins chosen from the retrotransposon Gag-like family, or a combination thereof. Exemplary endo-Gag polypeptides are BOP, LDOC1, MOAP1, PEG10, PNMA3, PNMA5, PNMA6A, PNMA6B, RTL3, RTL6, RTL8A, RTL8B, and ZNF18. In some instances, the Arc polypeptide, the Copia protein, the ASPRV1 protein, the protein from the SCAN domain family, the protein encoded by the Paraneoplastic Ma antigen family, and the protein or a combination of proteins chosen from the retrotransposon Gag-like family are each independently a full-length polypeptide. In other instances, the Arc polypeptide, the Copia protein, the ASPRV1 protein, the protein from the SCAN domain family, the protein encoded by the Paraneoplastic Ma antigen family, and the protein or a combination of proteins chosen from the retrotransposon Gag-like family are each independently a functional fragment thereof, e.g., that is capable of forming a subunit of a capsid.

Arc-Based Capsids and Endo-Gag-Based Capsids

In some embodiments, the capsid comprises an Arc-based capsid. In some embodiments, the capsid comprises an endo-Gag-based capsid. In some instances, the Arc-based and/or endo-Gag-based capsid comprises a plurality of recombinant Arc polypeptides and/or endo-Gag polypeptides described above, a plurality of engineered Arc polypeptides and/or endo-Gag polypeptides described above, or a combination thereof. In some cases, the Arc-based capsid comprises a plurality of recombinant Arc polypeptides. In other cases, the Arc-based capsid comprises a plurality of engineered Arc polypeptides. In some cases, the endo-Gag-based capsid comprises a plurality of recombinant endo-Gag polypeptides. In other cases, the endo-Gag-based capsid comprises a plurality of engineered endo-Gag polypeptides.

In some embodiments, the Arc-based or endo-Gag-based capsid comprises a first plurality of Arc and/or endo-Gag polypeptides from a first species and a second plurality of Arc and/or endo-Gag polypeptides from at least a second species. In some cases, the first species is selected from a eukaryote, a vertebrate, a human, a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant. In some cases, the second species is selected from a eukaryote, a vertebrate, a human, a mammal, a rodent, a bird, a reptile, a fish, an insect, a fungus, or a plant that is different from the first species.

In some instances, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 20:1, 50:1, or 100:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 2:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 4:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 5:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 8:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 10:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 20:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 50:1. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 100:1. In some instances, the ratio is the comparison in molar concentration. In some instances, the ratio is the comparison in the number of capsid forming subunits (e.g., each of the recombinant or engineered Arc or endo-Gag polypeptides forms a capsid subunit).

In some instances, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, 1:10, 1:20, or 1:50. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:2. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:5. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:8. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:10. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:20. In some cases, the ratio of the first plurality of Arc or endo-Gag polypeptides to the second plurality of Arc or endo-Gag polypeptides is 1:50. In some instances, the ratio is the comparison in molar concentration. In some instances, the ratio is the comparison in the number of capsid forming subunits (e.g., each of the recombinant or engineered Arc or endo-Gag polypeptides forms a capsid subunit).
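By way of non-limiting illustration, a first:second ratio expressed in numbers of capsid forming subunits can be converted to fractions of the whole, and to subunit counts for a capsid of a given size, as sketched below. The function names are hypothetical, and the total subunit count is an arbitrary input supplied by the user; no particular number of subunits per capsid is stated or implied by the disclosure.

```python
from fractions import Fraction


def subunit_fractions(first: int, second: int) -> tuple[Fraction, Fraction]:
    """Convert a first:second ratio into exact fractions of the total."""
    if first <= 0 or second <= 0:
        raise ValueError("both ratio terms must be positive")
    total = first + second
    return Fraction(first, total), Fraction(second, total)


def split_subunits(total_subunits: int, first: int, second: int) -> tuple[int, int]:
    """Split a hypothetical total subunit count according to first:second.

    The first count is rounded to the nearest whole subunit and the second
    count is whatever remains, so the two always sum to `total_subunits`.
    """
    if total_subunits < 0:
        raise ValueError("total_subunits must be non-negative")
    first_fraction, _ = subunit_fractions(first, second)
    n_first = round(total_subunits * first_fraction)
    return n_first, total_subunits - n_first
```

For example, a 4:1 ratio corresponds to four fifths and one fifth of the subunits, and a 1:50 ratio corresponds to 1/51 and 50/51.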

In some embodiments, the Arc-based capsid or endo-Gag-based capsid comprises a plurality of recombinant or engineered Arc polypeptides and a plurality of non-Arc proteins. Exemplary species of non-Arc proteins include but are not limited to, Copia, ASPRV1, a protein or a combination of proteins chosen from the SCAN domain family, a protein or a combination of proteins chosen from the Paraneoplastic Ma antigen family, and a protein or a combination of proteins chosen from the retrotransposon Gag-like family. Exemplary species of non-Arc proteins include BOP, LDOC1, MOAP1, PEG10, PNMA3, PNMA5, PNMA6A, PNMA6B, RTL3, RTL6, RTL8A, RTL8B, and ZNF18.

In some instances, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 20:1, 50:1, or 100:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 2:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 4:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 5:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 8:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 10:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 20:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 50:1. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 100:1. In some instances, the ratio is the comparison in molar concentration. In some instances, the ratio is the comparison in the number of capsid forming subunits (e.g., each of the recombinant or engineered Arc polypeptides forms a capsid subunit).

In some instances, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, 1:10, 1:20, or 1:50. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:2. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:5. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:8. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:10. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:20. In some cases, the ratio of the plurality of recombinant or engineered Arc polypeptides to the plurality of non-Arc proteins is 1:50. In some instances, the ratio is the comparison in molar concentration. In some instances, the ratio is the comparison in the number of capsid forming subunits (e.g., each of the recombinant or engineered Arc polypeptides forms a capsid subunit).

In some embodiments, the capsid has a diameter of at least 1 nm, or more. In some instances, the capsid has a diameter of at least 2 nm, 3 nm, 4 nm, 5 nm, 10 nm, 15 nm, 20 nm, 25 nm, 30 nm, 40 nm, 50 nm, 60 nm, 70 nm, 80 nm, 90 nm, 100 nm, 150 nm, 200 nm, 300 nm, 400 nm, 500 nm, 600 nm, or more. In some instances, the capsid has a diameter of at least 5 nm, or more. In some cases, the capsid has a diameter of at least 10 nm, or more. In some instances, the capsid has a diameter of at least 20 nm, or more. In some cases, the capsid has a diameter of at least 30 nm, or more. In some cases, the capsid has a diameter of at least 40 nm, or more. In some cases, the capsid has a diameter of at least 50 nm, or more. In some cases, the capsid has a diameter of at least 80 nm, or more. In some cases, the capsid has a diameter of at least 100 nm, or more. In some cases, the capsid has a diameter of at least 200 nm, or more. In some cases, the capsid has a diameter of at least 300 nm, or more. In some cases, the capsid has a diameter of at least 400 nm, or more. In some cases, the capsid has a diameter of at least 500 nm, or more. In some cases, the capsid has a diameter of at least 600 nm, or more.

In some embodiments, the capsid has a diameter of at most 1 nm, or less. In some instances, the capsid has a diameter of at most 2 nm, 3 nm, 4 nm, 5 nm, 10 nm, 15 nm, 20 nm, 25 nm, 30 nm, 40 nm, 50 nm, 60 nm, 70 nm, 80 nm, 90 nm, 100 nm, 150 nm, 200 nm, 300 nm, 400 nm, 500 nm, 600 nm, or less. In some instances, the capsid has a diameter of at most 5 nm, or less. In some cases, the capsid has a diameter of at most 10 nm, or less. In some instances, the capsid has a diameter of at most 20 nm, or less. In some cases, the capsid has a diameter of at most 30 nm, or less. In some cases, the capsid has a diameter of at most 40 nm, or less. In some cases, the capsid has a diameter of at most 50 nm, or less. In some cases, the capsid has a diameter of at most 80 nm, or less. In some cases, the capsid has a diameter of at most 100 nm, or less. In some cases, the capsid has a diameter of at most 200 nm, or less. In some cases, the capsid has a diameter of at most 300 nm, or less. In some cases, the capsid has a diameter of at most 400 nm, or less. In some cases, the capsid has a diameter of at most 500 nm, or less. In some cases, the capsid has a diameter of at most 600 nm, or less.

In some embodiments, the capsid has a diameter of about 1 nm, 2 nm, 3 nm, 4 nm, 5 nm, 10 nm, 15 nm, 20 nm, 25 nm, 30 nm, 40 nm, 50 nm, 60 nm, 70 nm, 80 nm, 90 nm, 100 nm, 150 nm, 200 nm, 300 nm, 400 nm, 500 nm, or 600 nm. In some instances, the capsid has a diameter of about 5 nm. In some cases, the capsid has a diameter of about 10 nm. In some instances, the capsid has a diameter of about 20 nm. In some cases, the capsid has a diameter of about 30 nm. In some cases, the capsid has a diameter of about 40 nm. In some cases, the capsid has a diameter of about 50 nm. In some cases, the capsid has a diameter of about 80 nm. In some cases, the capsid has a diameter of about 100 nm. In some cases, the capsid has a diameter of about 200 nm. In some cases, the capsid has a diameter of about 300 nm. In some cases, the capsid has a diameter of about 400 nm. In some cases, the capsid has a diameter of about 500 nm. In some cases, the capsid has a diameter of about 600 nm.

In some embodiments, the capsid has a diameter of from about 1 nm to about 600 nm. In some instances, the capsid has a diameter of from about 2 nm to about 500 nm, from about 2 nm to about 400 nm, from about 2 nm to about 300 nm, from about 2 nm to about 200 nm, from about 2 nm to about 100 nm, from about 2 nm to about 50 nm, from about 2 nm to about 30 nm, from about 20 nm to about 400 nm, from about 20 nm to about 300 nm, from about 20 nm to about 200 nm, from about 20 nm to about 100 nm, from about 20 nm to about 50 nm, from about 20 nm to about 30 nm, from about 30 nm to about 500 nm, from about 30 nm to about 400 nm, from about 30 nm to about 300 nm, from about 30 nm to about 200 nm, from about 30 nm to about 100 nm, from about 30 nm to about 50 nm, from about 50 nm to about 300 nm, from about 50 nm to about 200 nm, from about 50 nm to about 100 nm, from about 2 nm to about 25 nm, from about 2 nm to about 20 nm, from about 2 nm to about 10 nm, from about 5 nm to about 25 nm, from about 5 nm to about 20 nm, from about 5 nm to about 10 nm, from about 10 nm to about 25 nm, or from about 10 nm to about 20 nm.

In some embodiments, the capsid has a reduced off-target effect. In somecases, the off-target effect is less than 10%, 5%, 4%, 3%, 2%, 1%, or0.5%. In some cases, the off-target effect is no more than 10%, 5%, 4%,3%, 2%, 1%, or 0.5%.

In some cases, the capsid does not have an off-target effect.

In certain embodiments, the formation of Arc and/or endo-Gag-based capsids occurs either ex vivo or in vitro.

In some instances, the Arc and/or endo-Gag-based capsids are assembled in vivo.

In some instances, the Arc and/or endo-Gag-based capsids are stable at room temperature. In some cases, the Arc and/or endo-Gag-based capsids are empty. In other cases, the Arc and/or endo-Gag-based capsids are loaded (for example, loaded with a cargo and/or a therapeutic agent, e.g., a DNA or an RNA).

In some instances, the Arc and/or endo-Gag-based capsids are stable at a temperature from about 2° C. to about 37° C. In some instances, the Arc and/or endo-Gag-based capsids are stable at a temperature from about 2° C. to about 8° C., about 2° C. to about 4° C., about 20° C. to about 37° C., about 25° C. to about 37° C., about 20° C. to about 30° C., about 25° C. to about 30° C., or about 30° C. to about 37° C. In some cases, the Arc and/or endo-Gag-based capsids are empty. In other cases, the Arc and/or endo-Gag-based capsids are loaded (for example, loaded with a cargo and/or a therapeutic agent, e.g., a DNA or an RNA).

In some instances, the Arc and/or endo-Gag-based capsids are stable for at least about 1 day, 2 days, 4 days, 5 days, 7 days, 14 days, 28 days, 30 days, 60 days, 2 months, 3 months, 4 months, 5 months, 6 months, 12 months, 18 months, 24 months, 3 years, 5 years, or longer. In some cases, the Arc and/or endo-Gag-based capsids have minimal degradation, e.g., less than 10%, 5%, 4%, 3%, 2%, 1%, or 0.5% of the total population of the Arc and/or endo-Gag-based capsids is degraded. In some cases, the Arc and/or endo-Gag-based capsids are empty. In other cases, the Arc and/or endo-Gag-based capsids are loaded (for example, loaded with a therapeutic agent, e.g., a DNA or an RNA).

Additional Capsids

In some embodiments, the capsid comprises the Copia protein. In some instances, the Copia protein is from Drosophila melanogaster (UniProtKB - P04146), Ceratitis capitata (UniProtKB - W8BHY5), or Drosophila simulans (UniProtKB - Q08461).

In some embodiments, the capsid comprises the protein ASPRV1. The ASPRV1 protein is a structural protein that participates in the development and maintenance of the skin barrier. In some instances, the protein ASPRV1 is from Homo sapiens (UniProtKB - Q53RT3).

In some embodiments, the capsid comprises a protein from the SCAN domain family. The SCAN domain family is a superfamily of zinc finger transcription factors. The SCAN domain is also known as the leucine rich region (LeR) and functions as a protein interaction domain that mediates self-association or selective association with other proteins.

In some embodiments, the capsid comprises a protein from the Paraneoplastic Ma antigen family. The Paraneoplastic Ma antigen family comprises about 14 members of neuro- and testis-specific proteins.

In some embodiments, the capsid comprises a protein encoded by a Retrotransposon Gag-like gene.

In some embodiments, the capsid comprises BOP, LDOC1, MOAP1, PEG10, PNMA3, PNMA5, PNMA6A, PNMA6B, RTL3, RTL6, RTL8A, RTL8B, and/or ZNF18.

Cargos

In some embodiments, a composition of the disclosure (for example, a capsid) comprises a cargo. In some embodiments, the cargo is a therapeutic agent. In some embodiments, the cargo is a nucleic acid molecule, a small molecule, a protein, a peptide, an antibody or binding fragment thereof, a peptidomimetic, or a nucleotidomimetic. In some instances, the cargo is a therapeutic cargo, comprising, e.g., one or more drugs. In some instances, the cargo comprises a diagnostic tool for profiling, e.g., one or more markers (such as markers associated with one or more disease phenotypes). In additional instances, the cargo comprises an imaging tool.

In some instances, the cargo is a nucleic acid molecule. Exemplary nucleic acid molecules include DNA, RNA, or a mixture of DNA and RNA. In some instances, the nucleic acid molecule is a DNA polymer. In some cases, the DNA is a single stranded DNA polymer. In other cases, the DNA is a double stranded DNA polymer. In additional cases, the DNA is a hybrid of single and double stranded DNA polymer.

In some embodiments, the nucleic acid molecule is an RNA polymer, e.g., a single stranded RNA polymer, a double stranded RNA polymer, or a hybrid of single and double stranded RNA polymers. In some instances, the RNA comprises and/or encodes an antisense oligoribonucleotide, a siRNA, an mRNA, a tRNA, an rRNA, a snRNA, a shRNA, a microRNA, or a non-coding RNA.

In some embodiments, the nucleic acid molecule comprises a hybrid of DNA and RNA.

In some embodiments, the nucleic acid molecule is an antisense oligonucleotide, optionally comprising DNA, RNA, or a hybrid of DNA and RNA.

In some instances, the nucleic acid molecule comprises and/or encodes an mRNA molecule.

In some embodiments, the nucleic acid molecule comprises and/or encodes an RNAi molecule. In some cases, the RNAi molecule is a microRNA (miRNA) molecule. In other cases, the RNAi molecule is a siRNA molecule. The miRNA and/or siRNA are optionally double-stranded or in the form of a hairpin, and further optionally encapsulated as precursor molecules.

In some embodiments, the nucleic acid molecule is for use in a nucleic acid-based therapy. In some instances, the nucleic acid molecule is for regulating gene expression (e.g., modulating mRNA translation or degradation), modulating RNA splicing, or RNA interference. In some cases, the nucleic acid molecule comprises and/or encodes an antisense oligonucleotide, microRNA molecule, siRNA molecule, or mRNA molecule for use in regulation of gene expression, modulating RNA splicing, or RNA interference.

In some instances, the nucleic acid molecule is for use in gene editing. Exemplary gene editing systems include, but are not limited to, CRISPR-Cas systems, zinc finger nuclease (ZFN) systems, and transcription activator-like effector nuclease (TALEN) systems. In some cases, the nucleic acid molecule comprises and/or encodes a component involved in the CRISPR-Cas systems, ZFN systems, or the TALEN systems.

In some cases, the nucleic acid molecule is for use in antigen production for therapeutic and/or prophylactic vaccine production. For example, the nucleic acid molecule encodes an antigen that is expressed and elicits a desirable immune response (e.g., a pro-inflammatory immune response, an anti-inflammatory immune response, a B cell response, an antibody response, a T cell response, a CD4+ T cell response, a CD8+ T cell response, a Th1 immune response, a Th2 immune response, a Th17 immune response, a Treg immune response, or a combination thereof).

In some cases, the nucleic acid molecule comprises a nucleic acid enzyme. Nucleic acid enzymes are RNA molecules (e.g., ribozymes) or DNA molecules (e.g., deoxyribozymes) that have catalytic activities. In some instances, the nucleic acid molecule is a ribozyme. In other instances, the nucleic acid molecule is a deoxyribozyme. In some cases, the nucleic acid molecule is a MNAzyme, which functions as a biosensor and/or a molecular switch (see, e.g., Mokany, et al., “MNAzymes, a versatile new class of nucleic acid enzymes that can function as biosensors and molecular switches,” JACS 132(2): 1051-1059 (2010)).

In some instances, exemplary targets of the nucleic acid molecule include, but are not limited to, UL123 (human cytomegalovirus), APOB, AR (androgen receptor) gene, KRAS, PCSK9, CFTR, and SMN (e.g., SMN2).

In some embodiments, the nucleic acid molecule is at least 5 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500, 1000, 1500, 2000, 3000, 4000, 5000, 6000, 7000, 8000, or 9000 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 10 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 12 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 15 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 18 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 19 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 20 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 21 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 22 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 23 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 24 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 25 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 26 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 27 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 28 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 29 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 30 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 40 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 50 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 100 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 200 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 300 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 500 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 1000 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 2000 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 3000 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 4000 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 5000 nucleotides or more in length. In some instances, the nucleic acid molecule is at least 8000 nucleotides or more in length.

In some embodiments, the nucleic acid molecule is at most 12 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500, 1000, 1500, 2000, 3000, 4000, 5000, 6000, 7000, 8000, or 9000 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 15 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 18 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 19 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 20 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 21 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 22 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 23 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 24 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 25 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 26 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 27 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 28 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 29 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 30 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 40 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 50 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 100 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 200 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 300 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 500 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 1000 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 2000 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 3000 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 4000 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 5000 nucleotides or less in length. In some instances, the nucleic acid molecule is at most 8000 nucleotides or less in length.

In some embodiments, the nucleic acid molecule is about 5 nucleotides in length. In some instances, the nucleic acid molecule is about 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, 500, 1000, 1500, 2000, 3000, 4000, 5000, 6000, 7000, 8000, or 9000 nucleotides in length. In some instances, the nucleic acid molecule is about 10 nucleotides in length. In some instances, the nucleic acid molecule is about 12 nucleotides in length. In some instances, the nucleic acid molecule is about 15 nucleotides in length. In some instances, the nucleic acid molecule is about 18 nucleotides in length. In some instances, the nucleic acid molecule is about 19 nucleotides in length. In some instances, the nucleic acid molecule is about 20 nucleotides in length. In some instances, the nucleic acid molecule is about 21 nucleotides in length. In some instances, the nucleic acid molecule is about 22 nucleotides in length. In some instances, the nucleic acid molecule is about 23 nucleotides in length. In some instances, the nucleic acid molecule is about 24 nucleotides in length. In some instances, the nucleic acid molecule is about 25 nucleotides in length. In some instances, the nucleic acid molecule is about 26 nucleotides in length. In some instances, the nucleic acid molecule is about 27 nucleotides in length. In some instances, the nucleic acid molecule is about 28 nucleotides in length. In some instances, the nucleic acid molecule is about 29 nucleotides in length. In some instances, the nucleic acid molecule is about 30 nucleotides in length. In some instances, the nucleic acid molecule is about 40 nucleotides in length. In some instances, the nucleic acid molecule is about 50 nucleotides in length. In some instances, the nucleic acid molecule is about 100 nucleotides in length. In some instances, the nucleic acid molecule is about 200 nucleotides in length. In some instances, the nucleic acid molecule is about 300 nucleotides in length. In some instances, the nucleic acid molecule is about 500 nucleotides in length. In some instances, the nucleic acid molecule is about 1000 nucleotides in length. In some instances, the nucleic acid molecule is about 2000 nucleotides in length. In some instances, the nucleic acid molecule is about 3000 nucleotides in length. In some instances, the nucleic acid molecule is about 4000 nucleotides in length. In some instances, the nucleic acid molecule is about 5000 nucleotides in length. In some instances, the nucleic acid molecule is about 8000 nucleotides in length.

In some embodiments, the nucleic acid molecule is from about 5 to about 10,000 nucleotides in length. In some instances, the nucleic acid molecule is from about 5 to about 9000 nucleotides in length, from about 5 to about 8000 nucleotides in length, from about 5 to about 7000 nucleotides in length, from about 5 to about 6000 nucleotides in length, from about 5 to about 5000 nucleotides in length, from about 5 to about 4000 nucleotides in length, from about 5 to about 3000 nucleotides in length, from about 5 to about 2000 nucleotides in length, from about 5 to about 1000 nucleotides in length, from about 5 to about 500 nucleotides in length, from about 5 to about 100 nucleotides in length, from about 5 to about 50 nucleotides in length, from about 5 to about 40 nucleotides in length, from about 5 to about 30 nucleotides in length, from about 5 to about 25 nucleotides in length, from about 5 to about 20 nucleotides in length, from about 10 to about 8000 nucleotides in length, from about 10 to about 7000 nucleotides in length, from about 10 to about 6000 nucleotides in length, from about 10 to about 5000 nucleotides in length, from about 10 to about 4000 nucleotides in length, from about 10 to about 3000 nucleotides in length, from about 10 to about 2000 nucleotides in length, from about 10 to about 1000 nucleotides in length, from about 10 to about 500 nucleotides in length, from about 10 to about 100 nucleotides in length, from about 10 to about 50 nucleotides in length, from about 10 to about 40 nucleotides in length, from about 10 to about 30 nucleotides in length, from about 10 to about 25 nucleotides in length, from about 10 to about 20 nucleotides in length, from about 18 to about 8000 nucleotides in length, from about 18 to about 7000 nucleotides in length, from about 18 to about 6000 nucleotides in length, from about 18 to about 5000 nucleotides in length, from about 18 to about 4000 nucleotides in length, from about 18 to about 3000 nucleotides in length, from about 18 to about 2000 nucleotides in length, from about 18 to about 1000 nucleotides in length, from about 18 to about 500 nucleotides in length, from about 18 to about 100 nucleotides in length, from about 18 to about 50 nucleotides in length, from about 18 to about 40 nucleotides in length, from about 18 to about 30 nucleotides in length, from about 18 to about 25 nucleotides in length, from about 12 to about 50 nucleotides in length, from about 20 to about 40 nucleotides in length, from about 20 to about 30 nucleotides in length, or from about 25 to about 30 nucleotides in length.

In some embodiments, the nucleic acid molecule comprises natural, synthetic, or artificial nucleotide analogues or bases. In some cases, the nucleic acid molecule comprises combinations of DNA, RNA and/or nucleotide analogues. In some instances, the synthetic or artificial nucleotide analogues or bases comprise modifications at one or more of ribose moiety, phosphate moiety, nucleoside moiety, or a combination thereof.

In some embodiments, a nucleotide analogue or artificial nucleotide base described above comprises a nucleic acid with a modification at a 2′ hydroxyl group of the ribose moiety. In some instances, the modification includes an H, OR, R, halo, SH, SR, NH2, NHR, NR2, or CN, wherein R is an alkyl moiety. Exemplary alkyl moiety includes, but is not limited to, halogens, sulfurs, thiols, thioethers, thioesters, amines (primary, secondary, or tertiary), amides, ethers, esters, alcohols and oxygen. In some instances, the alkyl moiety further comprises a modification. In some instances, the modification comprises an azo group, a keto group, an aldehyde group, a carboxyl group, a nitro group, a nitroso group, a nitrile group, a heterocycle (e.g., imidazole, hydrazino or hydroxylamino) group, an isocyanate or cyanate group, or a sulfur containing group (e.g., sulfoxide, sulfone, sulfide, or disulfide). In some instances, the alkyl moiety further comprises a hetero substitution. In some instances, the carbon of the heterocyclic group is substituted by a nitrogen, oxygen or sulfur. In some instances, the heterocyclic substitution includes, but is not limited to, morpholino, imidazole, and pyrrolidino.

In some instances, the modification at the 2′ hydroxyl group is a 2′-O-methyl modification or a 2′-O-methoxyethyl (2′-O-MOE) modification. In some cases, the 2′-O-methyl modification adds a methyl group to the 2′ hydroxyl group of the ribose moiety whereas the 2′-O-methoxyethyl modification adds a methoxyethyl group to the 2′ hydroxyl group of the ribose moiety.

In some instances, the modification at the 2′ hydroxyl group is a 2′-O-aminopropyl modification in which an extended amine group comprising a propyl linker binds the amine group to the 2′ oxygen. In some instances, this modification neutralizes the phosphate-derived overall negative charge of the oligonucleotide molecule by introducing one positive charge from the amine group per sugar and thereby improves cellular uptake properties due to its zwitterionic properties.
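By way of a non-limiting illustration, the charge neutralization described above can be sketched as simple arithmetic. The sketch below is not part of the disclosure. It assumes that an oligonucleotide of n nucleotides carries n - 1 internucleotide phosphate linkages, each contributing one negative charge, that each 2′-O-aminopropyl group contributes one positive charge when protonated, and that terminal phosphates and base charges are ignored. The function name `net_backbone_charge` and its parameters are hypothetical.

```python
def net_backbone_charge(length: int, aminopropyl_count: int) -> int:
    """Estimate the net formal charge of an oligonucleotide.

    Assumptions (illustrative only): one negative charge per
    internucleotide phosphate linkage (length - 1 linkages), one
    positive charge per protonated 2'-O-aminopropyl group, and no
    contribution from terminal phosphates or nucleobases.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0 <= aminopropyl_count <= length:
        raise ValueError("aminopropyl_count must be between 0 and length")
    phosphate_charge = -(length - 1)   # one -1 per linkage
    amine_charge = aminopropyl_count   # one +1 per modified sugar
    return phosphate_charge + amine_charge


# Example: a 20-mer with no modification versus one carrying
# a 2'-O-aminopropyl group on 19 of its sugars.
unmodified = net_backbone_charge(20, 0)
modified = net_backbone_charge(20, 19)
```

Under these assumptions, modifying as many sugars as there are phosphate linkages brings the estimated net charge to zero, which is consistent with the zwitterionic character noted above.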

In some instances, the modification at the 2′ hydroxyl group is a locked or bridged ribose modification (e.g., locked nucleic acid or LNA) in which the oxygen molecule bound at the 2′ carbon is linked to the 4′ carbon by a methylene group, thus forming a 2′-C,4′-C-oxy-methylene-linked bicyclic ribonucleotide monomer.

In some embodiments, additional modifications at the 2′ hydroxyl group include 2′-deoxy, 2′-deoxy-2′-fluoro, 2′-O-aminopropyl (2′-O-AP), 2′-O-dimethylaminoethyl (2′-O-DMAOE), 2′-O-dimethylaminopropyl (2′-O-DMAP), 2′-O-dimethylaminoethyloxyethyl (2′-O-DMAEOE), or 2′-O-N-methylacetamido (2′-O-NMA).

In some embodiments, a nucleotide analogue comprises a modified base such as, but not limited to, 5-propynyluridine, 5-propynylcytidine, 6-methyladenine, 6-methylguanine, N,N-dimethyladenine, 2-propyladenine, 2-propylguanine, 2-aminoadenine, 1-methylinosine, 3-methyluridine, 5-methylcytidine, 5-methyluridine and other nucleotides having a modification at the 5 position, 5-(2-amino) propyl uridine, 5-halocytidine, 5-halouridine, 4-acetylcytidine, 1-methyladenosine, 2-methyladenosine, 3-methylcytidine, 6-methyluridine, 2-methylguanosine, 7-methylguanosine, 2,2-dimethylguanosine, 5-methylaminoethyluridine, 5-methyloxyuridine, deazanucleotides (such as 7-deaza-adenosine, 6-azouridine, 6-azocytidine, or 6-azothymidine), 5-methyl-2-thiouridine, other thio bases (such as 2-thiouridine, 4-thiouridine, and 2-thiocytidine), dihydrouridine, pseudouridine, queuosine, archaeosine, naphthyl and substituted naphthyl groups, any O- and N-alkylated purines and pyrimidines (such as N6-methyladenosine, 5-methylcarbonylmethyluridine, uridine 5-oxyacetic acid, pyridine-4-one, or pyridine-2-one), phenyl and modified phenyl groups such as aminophenol or 2,4,6-trimethoxy benzene, modified cytosines that act as G-clamp nucleotides, 8-substituted adenines and guanines, 5-substituted uracils and thymines, azapyrimidines, carboxyhydroxyalkyl nucleotides, carboxyalkylaminoalkyl nucleotides, and alkylcarbonylalkylated nucleotides. Modified nucleotides also include those nucleotides that are modified with respect to the sugar moiety, as well as nucleotides having sugars or analogs thereof that are not ribosyl. For example, the sugar moieties, in some cases, are or are based on mannoses, arabinoses, glucopyranoses, galactopyranoses, 4′-thioribose, and other sugars, heterocycles, or carbocycles. The term nucleotide also includes universal bases. By way of example, universal bases include, but are not limited to, 3-nitropyrrole, 5-nitroindole, or nebularine.

In some embodiments, a nucleotide analogue further comprises a morpholino, a peptide nucleic acid (PNA), a methylphosphonate nucleotide, a thiolphosphonate nucleotide, a 2′-fluoro N3-P5′-phosphoramidite, or a 1′,5′-anhydrohexitol nucleic acid (HNA). Morpholino or phosphorodiamidate morpholino oligo (PMO) comprises synthetic molecules whose structure mimics natural nucleic acid structure but deviates from the normal sugar and phosphate structures. In some instances, the five member ribose ring is substituted with a six member morpholino ring containing four carbons, one nitrogen, and one oxygen. In some cases, the ribose monomers are linked by a phosphorodiamidate group instead of a phosphate group. In such cases, the backbone alterations remove all positive and negative charges, making morpholinos neutral molecules capable of crossing cellular membranes without the aid of cellular delivery agents such as those used by charged oligonucleotides.

In some embodiments, peptide nucleic acid (PNA) does not contain a sugar ring or phosphate linkage, and the bases are attached and appropriately spaced by oligoglycine-like molecules, thereby eliminating a backbone charge.

In some embodiments, one or more modifications optionally occur at the internucleotide linkage. In some instances, modified internucleotide linkage includes, but is not limited to, phosphorothioates; phosphorodithioates; methylphosphonates; 5′-alkylenephosphonates; 5′-methylphosphonate; 3′-alkylene phosphonates; borontrifluoridates; borano phosphate esters and selenophosphates of 3′-5′ linkage or 2′-5′ linkage; phosphotriesters; thionoalkylphosphotriesters; hydrogen phosphonate linkages; alkyl phosphonates; alkylphosphonothioates; arylphosphonothioates; phosphoroselenoates; phosphorodiselenoates; phosphinates; phosphoramidates; 3′-alkylphosphoramidates; aminoalkylphosphoramidates; thionophosphoramidates; phosphoropiperazidates; phosphoroanilothioates; phosphoroanilidates; ketones; sulfones; sulfonamides; carbonates; carbamates; methylenehydrazos; methylenedimethylhydrazos; formacetals; thioformacetals; oximes; methyleneiminos; methylenemethyliminos; thioamidates; linkages with riboacetyl groups; aminoethyl glycine; silyl or siloxane linkages; alkyl or cycloalkyl linkages with or without heteroatoms of, for example, 1 to 10 carbons that are saturated or unsaturated and/or substituted and/or contain heteroatoms; linkages with morpholino structures, amides, or polyamides wherein the bases are attached to the aza nitrogens of the backbone directly or indirectly; and combinations thereof.

In some embodiments, one or more modifications comprise a modified phosphate backbone in which the modification generates a neutral or uncharged backbone. In some instances, the phosphate backbone is modified by alkylation to generate an uncharged or neutral phosphate backbone. As used herein, alkylation includes methylation, ethylation, and propylation. In some cases, an alkyl group, as used herein in the context of alkylation, refers to a linear or branched saturated hydrocarbon group containing from 1 to 6 carbon atoms. In some instances, exemplary alkyl groups include, but are not limited to, methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, sec-butyl, tert-butyl, n-pentyl, isopentyl, neopentyl, hexyl, isohexyl, 1,1-dimethylbutyl, 2,2-dimethylbutyl, 3,3-dimethylbutyl, and 2-ethylbutyl groups. In some cases, a modified phosphate is a phosphate group as described in U.S. Pat. No. 9,481,905.

In some embodiments, additional modified phosphate backbones comprise methylphosphonate, ethylphosphonate, methylthiophosphonate, or methoxyphosphonate. In some cases, the modified phosphate is methylphosphonate. In some cases, the modified phosphate is ethylphosphonate. In some cases, the modified phosphate is methylthiophosphonate. In some cases, the modified phosphate is methoxyphosphonate.

In some embodiments, one or more modifications further optionally include modifications of the ribose moiety, phosphate backbone and the nucleoside, or modifications of the nucleotide analogues at the 3′ or the 5′ terminus. For example, the 3′ terminus optionally includes a 3′ cationic group, or is modified by inverting the nucleoside at the 3′-terminus with a 3′-3′ linkage. In another alternative, the 3′-terminus is optionally conjugated with an aminoalkyl group, e.g., a 3′ C5-aminoalkyl dT. In an additional alternative, the 3′-terminus is optionally conjugated with an abasic site, e.g., with an apurinic or apyrimidinic site. In some instances, the 5′-terminus is conjugated with an aminoalkyl group, e.g., a 5′-O-alkylamino substituent. In some cases, the 5′-terminus is conjugated with an abasic site, e.g., with an apurinic or apyrimidinic site.

In some embodiments, exemplary nucleic acid cargos include, but are not limited to, Fomivirsen, Mipomersen, AZD5312 (AstraZeneca), Nusinersen, and SB010 (Sterna Biologicals).

Small Molecules

In some embodiments, the cargo is a small molecule. In some instances, the small molecule is an inhibitor (e.g., a pan inhibitor or a selective inhibitor). In other instances, the small molecule is an activator. In additional cases, the small molecule is an agonist, antagonist, a partial agonist, a mixed agonist/antagonist, or a competitive antagonist.

In some embodiments, the small molecule is a drug that falls under the class of analgesics, antianxiety drugs, antiarrhythmics, antibacterials, antibiotics, anticoagulants and thrombolytics, anticonvulsants, antidepressants, antidiarrheals, antiemetics, antifungals, antihistamines, antihypertensives, anti-inflammatories, antineoplastics, antipsychotics, antipyretics, antivirals, barbiturates, beta-blockers, bronchodilators, cold cures, corticosteroids, cough suppressants, cytotoxics, decongestants, diuretics, expectorants, hormones, hypoglycemics, immunosuppressives, laxatives, muscle relaxants, sex hormones, sleeping drugs, or tranquilizers.

In some embodiments, the small molecule is an inhibitor, e.g., an inhibitor of a kinase pathway such as the Tyrosine kinase pathway or a Serine/Threonine kinase pathway. In some cases, the small molecule is a dual protein kinase inhibitor. In some cases, the small molecule is a lipid kinase inhibitor.

In some cases, the small molecule is a neuraminidase inhibitor.

In some cases, the small molecule is a carbonic anhydrase inhibitor.

In some embodiments, exemplary targets of the small molecule include, but are not limited to, vascular endothelial growth factor receptor 1 (VEGFR1), vascular endothelial growth factor receptor 2 (VEGFR2), vascular endothelial growth factor receptor 3 (VEGFR3), fibroblast growth factor receptor 1 (FGFR1), fibroblast growth factor receptor 2 (FGFR2), fibroblast growth factor receptor 3 (FGFR3), fibroblast growth factor receptor 4 (FGFR4), cyclin-dependent kinase 4 (CDK4), cyclin-dependent kinase 6 (CDK6), a receptor tyrosine kinase, a phosphoinositide 3-kinase (PI3K) isoform (e.g., PI3Kδ, also known as p110δ), Janus kinase 1 (JAK1), Janus kinase 3 (JAK3), a receptor from the family of platelet-derived growth factor receptors (PDGF-R), and carbonic anhydrase (e.g., carbonic anhydrase I).

In some embodiments, the small molecule targets a viral protein, e.g., a viral envelope protein. In some embodiments, the small molecule decreases viral adsorption to a host cell. In some embodiments, the small molecule decreases viral entry into a host cell. In some embodiments, the small molecule decreases viral replication in a host or a host cell. In some embodiments, the small molecule decreases viral assembly.

In some embodiments, exemplary small molecule cargos include, but are not limited to, lenvatinib, palbociclib, regorafenib, idelalisib, tofacitinib, nintedanib, zanamivir, ethoxzolamide, and artemisinin.

Proteins

In some embodiments, the cargo is a protein. In some instances, theprotein is a full-length protein. In other instances, the protein is afragment, e.g., a functional fragment. In some cases, the protein is anaturally occurring protein. In additional cases, the protein is a denovo engineered protein. In further cases, the protein is a fusionprotein. In further cases, the protein is a recombinant protein.Exemplary proteins include, but are not limited to, Fc fusion proteins,anticoagulants, blood factors, bone morphogenetic proteins, enzymes,growth factors, hormones, interferons, interleukins, and thrombolytics.

In some instances, the protein is for use in an enzyme replacementtherapy.

In some cases, the protein is for use in antigen production fortherapeutic and/or prophylactic vaccine production. For example, theprotein comprises an antigen that elicits a desirable immune response(e.g., a pro-inflammatory immune response, an anti-inflammatory immuneresponse, an B cell response, an antibody response, a T cell response, aCD4+ T cell response, a CD8+ T cell response, a Th1 immune response, aTh2 immune response, a Th17 immune response, a Treg immune response, ora combination thereof).

In some instances, exemplary protein cargos include, but are not limited to, romiplostim, liraglutide, a human growth hormone (rHGH), human insulin (BHI), follicle-stimulating hormone (FSH), Factor VIII, erythropoietin (EPO), granulocyte colony-stimulating factor (G-CSF), alpha-galactosidase A, alpha-L-iduronidase, N-acetylgalactosamine-4-sulfatase, dornase alfa, tissue plasminogen activator (TPA), glucocerebrosidase, interferon-beta-1a, insulin-like growth factor 1 (IGF-1), and rasburicase.

Peptides

In some embodiments, the cargo is a peptide. In some instances, the peptide is a naturally occurring peptide. In other instances, the peptide is an artificial engineered peptide or a recombinant peptide. In some cases, the peptide targets a G-protein coupled receptor, an ion channel, a microbe, an anti-microbial target, a catalytic or other Ig-family of receptors, an intracellular target, a membrane-anchored target, or an extracellular target.

In some cases, the peptide comprises at least 2 amino acids. In some cases, the peptide comprises at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 amino acids. In some cases, the peptide comprises at least 10 amino acids. In some cases, the peptide comprises at least 15 amino acids. In some cases, the peptide comprises at least 20 amino acids. In some cases, the peptide comprises at least 30 amino acids. In some cases, the peptide comprises at least 40 amino acids. In some cases, the peptide comprises at least 50 amino acids. In some cases, the peptide comprises at least 60 amino acids. In some cases, the peptide comprises at least 70 amino acids. In some cases, the peptide comprises at least 80 amino acids. In some cases, the peptide comprises at least 90 amino acids. In some cases, the peptide comprises at least 100 amino acids.

In some cases, the peptide comprises at most 3 amino acids. In some cases, the peptide comprises at most 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 amino acids. In some cases, the peptide comprises at most 10 amino acids. In some cases, the peptide comprises at most 15 amino acids. In some cases, the peptide comprises at most 20 amino acids. In some cases, the peptide comprises at most 30 amino acids. In some cases, the peptide comprises at most 40 amino acids. In some cases, the peptide comprises at most 50 amino acids. In some cases, the peptide comprises at most 60 amino acids. In some cases, the peptide comprises at most 70 amino acids. In some cases, the peptide comprises at most 80 amino acids. In some cases, the peptide comprises at most 90 amino acids. In some cases, the peptide comprises at most 100 amino acids.

In some cases, the peptide has a molecular weight of from about 1 to about 10 kDa. In some cases, the peptide has a molecular weight of from about 1 to about 9 kDa, about 1 to about 6 kDa, about 1 to about 5 kDa, about 1 to about 4 kDa, about 1 to about 3 kDa, about 2 to about 8 kDa, about 2 to about 6 kDa, about 2 to about 4 kDa, about 1.2 to about 2.8 kDa, about 1.5 to about 2.5 kDa, or about 1.5 to about 2 kDa.
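By way of non-limiting illustration, the relationship between the amino acid counts and the kilodalton ranges recited above can be approximated with a short calculation. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from this disclosure; the function names are hypothetical, and the word "about" in the recited ranges is not modeled.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# ASSUMPTION: ~110 Da is a textbook approximation, not a value from the disclosure.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return a rough peptide mass in kDa for a given residue count."""
    if n_residues < 1:
        raise ValueError("n_residues must be a positive integer")
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def in_range(value: float, low: float, high: float) -> bool:
    """Return True if value lies within the closed interval [low, high]."""
    return low <= value <= high


# Check a few residue counts against the broadest recited range (1 to 10 kDa).
for n in (10, 20, 50, 90, 100):
    mass = estimate_mass_kda(n)
    print(f"{n} residues: ~{mass:.1f} kDa, within 1-10 kDa: {in_range(mass, 1.0, 10.0)}")
```

Under this approximation, a peptide at the upper end of the recited length range (100 amino acids) falls slightly above 10 kDa, so the length and mass embodiments describe overlapping, rather than identical, sets of peptides.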

In some embodiments, the peptide is a cyclic peptide. In some instances, the cyclic peptide is a macrocyclic peptide. In other instances, the cyclic peptide is a constrained peptide. The cyclic peptides are assembled with varied linkages, such as, for example, head-to-tail, head-to-side-chain, side-chain to tail, and side-chain to side-chain linkages. In some instances, a cyclic peptide (e.g., a macrocyclic or a constrained peptide) has a molecular weight from about 500 Dalton to about 2000 Dalton. In other instances, a cyclic peptide (e.g., a macrocyclic or a constrained peptide) ranges from about 10 amino acids to about 100 amino acids, from about 10 amino acids to about 70 amino acids, or from about 10 amino acids to about 50 amino acids.

In some cases, the peptide is for use in antigen production for therapeutic and/or prophylactic vaccine production. For example, the peptide comprises an antigen that elicits a desirable immune response (e.g., a pro-inflammatory immune response, an anti-inflammatory immune response, a B cell response, an antibody response, a T cell response, a CD4+ T cell response, a CD8+ T cell response, a Th1 immune response, a Th2 immune response, a Th17 immune response, a Treg immune response, or a combination thereof).

In some embodiments, the peptide comprises natural amino acids, unnatural amino acids, or a combination thereof. In some instances, an amino acid residue refers to a molecule containing both an amino group and a carboxyl group. Suitable amino acids include, without limitation, both the D- and L-isomers of the naturally-occurring amino acids, as well as non-naturally occurring amino acids prepared by organic synthesis or other metabolic routes. The term amino acid, as used herein, includes, without limitation, α-amino acids, natural amino acids, non-natural amino acids, and amino acid analogs.

In some instances, α-amino acid refers to a molecule containing both an amino group and a carboxyl group bound to a carbon which is designated the α-carbon.

In some instances, β-amino acid refers to a molecule containing both an amino group and a carboxyl group in a β configuration.

In some embodiments, an amino acid analog is a racemic mixture. In some instances, the D isomer of the amino acid analog is used. In some cases, the L isomer of the amino acid analog is used. In some instances, the amino acid analog comprises chiral centers that are in the R or S configuration.

In some embodiments, exemplary peptide cargos include, but are not limited to, peginesatide, insulin, adrenocorticotropic hormone (ACTH), calcitonin, oxytocin, vasopressin, octreotide, and leuprorelin.

In some embodiments, exemplary peptide cargos include, but are not limited to, Telavancin, Dalbavancin, Oritavancin, Anidulafungin, Lanreotide, Pasireotide, Romidepsin, Linaclotide, and Peginesatide.

Antibodies

In some embodiments, the cargo is an antibody or a binding fragment thereof. In some instances, the antibody or binding fragment thereof comprises a humanized antibody or binding fragment thereof, murine antibody or binding fragment thereof, chimeric antibody or binding fragment thereof, monoclonal antibody or binding fragment thereof, bispecific antibody or binding fragment thereof, monovalent Fab′, divalent Fab₂, F(ab)′₃ fragments, single-chain variable fragment (scFv), bis-scFv, (scFv)₂, diabody, minibody, nanobody, triabody, tetrabody, disulfide stabilized Fv protein (dsFv), single-domain antibody (sdAb), Ig NAR, camelid antibody or binding fragment thereof, or a chemically modified derivative thereof.

In some instances, the antibody or binding fragment thereof recognizes a cell surface protein. In some instances, the cell surface protein is an antigen expressed by a cancerous cell. In some instances, the cell surface protein is a neoepitope. In some instances, the cell surface protein comprises one or more mutations compared to a wild-type protein. Exemplary cancer antigens include, but are not limited to, alpha fetoprotein, ASLG659, B7-H3, BAFF-R, Brevican, CA125 (MUC16), CA15-3, CA19-9, carcinoembryonic antigen (CEA), CA242, CRIPTO (CR, CR1, CRGF, CRIPTO, TDGF1, teratocarcinoma-derived growth factor), CTLA-4, CXCR5, E16 (LAT1, SLC7A5), FcRH2 (IFGP4, IRTA4, SPAP1A (SH2 domain containing phosphatase anchor protein 1a), SPAP1B, SPAP1C), epidermal growth factor, ETBR, Fc receptor-like protein 1 (FCRH1), GEDA, HLA-DOB (Beta subunit of MHC class II molecule (Ia antigen)), human chorionic gonadotropin, ICOS, IL-2 receptor, IL20Ra, Immunoglobulin superfamily receptor translocation associated 2 (IRTA2), L6, Lewis Y, Lewis X, MAGE-1, MAGE-2, MAGE-3, MAGE 4, MART1, mesothelin, MDP, MPF (SMR, MSLN), MCP1 (CCL2), macrophage inhibitory factor (MIF), MPG, MSG783, mucin, MUC1-KLH, Napi3b (SLC34A2), nectin-4, Neu oncogene product, NCA, placental alkaline phosphatase, prostate specific membrane antigen (PSMA), prostatic acid phosphatase, PSCA hlg, anti-transferrin receptor, p97, Purinergic receptor P2X ligand-gated ion channel 5 (P2X5), LY64 (Lymphocyte antigen 64 (RP105)), gp100, P21, six transmembrane epithelial antigen of prostate (STEAP1), STEAP2, Sema 5b, tumor-associated glycoprotein 72 (TAG-72), TrpM4 (BR22450, F1120041, TRPM4, TRPM4B, transient receptor potential cation channel, subfamily M, member 4) and the like.

In some instances, the cell surface protein comprises clusters of differentiation (CD) cell surface markers. Exemplary CD cell surface markers include, but are not limited to, CD1, CD2, CD3, CD4, CD5, CD6, CD7, CD8, CD9, CD10, CD11a, CD11b, CD11c, CD11d, CDw12, CD13, CD14, CD15, CD15s, CD16, CDw17, CD18, CD19, CD20, CD21, CD22, CD23, CD24, CD25, CD26, CD27, CD28, CD29, CD30, CD31, CD32, CD33, CD34, CD35, CD36, CD37, CD38, CD39, CD40, CD41, CD42, CD43, CD44, CD45, CD45RO, CD45RA, CD45RB, CD46, CD47, CD48, CD49a, CD49b, CD49c, CD49d, CD49e, CD49f, CD50, CD51, CD52, CD53, CD54, CD55, CD56, CD57, CD58, CD59, CDw60, CD61, CD62E, CD62L (L-selectin), CD62P, CD63, CD64, CD65, CD66a, CD66b, CD66c, CD66d, CD66e, CD71, CD79 (e.g., CD79a, CD79b), CD90, CD95 (Fas), CD103, CD104, CD125 (IL5RA), CD134 (OX40), CD137 (4-1BB), CD152 (CTLA-4), CD221, CD274, CD279 (PD-1), CD319 (SLAMF7), CD326 (EpCAM), and the like.

In some embodiments, exemplary antibodies or binding fragments thereof include, but are not limited to, zalutumumab (HuMax-EGFr, Genmab), abagovomab (Menarini), abituzumab (Merck), adecatumumab (MT201), alacizumab pegol, alemtuzumab (Campath®, MabCampath, or Campath-1H; Leukosite), AlloMune (BioTransplant), amatuximab (Morphotek, Inc.), anti-VEGF (Genentech), anatumomab mafenatox, apolizumab (hulD10), ascrinvacumab (Pfizer Inc.), atezolizumab (MPDL3280A; Genentech/Roche), B43.13 (OvaRex, AltaRex Corporation), basiliximab (Simulect®, Novartis), belimumab (Benlysta®, GlaxoSmithKline), bevacizumab (Avastin®, Genentech), blinatumomab (Blincyto, AMG103; Amgen), BEC2 (ImClone Systems Inc.), carlumab (Janssen Biotech), catumaxomab (Removab, Trion Pharma), CEAcide (Immunomedics), Cetuximab (Erbitux®, ImClone), citatuzumab bogatox (VB6-845), cixutumumab (IMC-A12, ImClone Systems Inc.), conatumumab (AMG 655, Amgen), dacetuzumab (SGN-40, huS2C6; Seattle Genetics, Inc.), daratumumab (Darzalex®, Janssen Biotech), detumomab, drozitumab (Genentech), durvalumab (MedImmune), dusigitumab (MedImmune), edrecolomab (MAb17-1A, Panorex, Glaxo Wellcome), elotuzumab (Empliciti™, Bristol-Myers Squibb), emibetuzumab (Eli Lilly), enavatuzumab (Facet Biotech Corp.), enfortumab vedotin (Seattle Genetics, Inc.), enoblituzumab (MGA271, MacroGenics, Inc.), ensituxumab (Neogenix Oncology, Inc.), epratuzumab (LymphoCide, Immunomedics, Inc.), ertumaxomab (Rexomun®, Trion Pharma), etaracizumab (Abegrin, MedImmune), farletuzumab (MORAb-003, Morphotek, Inc), FBTA05 (Lymphomun, Trion Pharma), ficlatuzumab (AVEO Pharmaceuticals), figitumumab (CP-751871, Pfizer), flanvotumab (ImClone Systems), fresolimumab (GC1008, Sanofi-Aventis), futuximab, glaximab, ganitumab (Amgen), girentuximab (Rencarex®, Wilex AG), IMAB362 (Claudiximab, Ganymed Pharmaceuticals AG), imalumab (Baxalta), IMC-1C11 (ImClone Systems), IMC-C225 (Imclone Systems Inc.), imgatuzumab (Genentech/Roche), intetumumab (Centocor, Inc.), ipilimumab (Yervoy®, Bristol-Myers Squibb), iratumumab (Medarex, Inc.), isatuximab (SAR650984, Sanofi-Aventis), labetuzumab (CEA-CIDE, Immunomedics), lexatumumab (ETR2-ST01, Cambridge Antibody Technology), lintuzumab (SGN-33, Seattle Genetics), lucatumumab (Novartis), lumiliximab, mapatumumab (HGS-ETR1, Human Genome Sciences), matuzumab (EMD 72000, Merck), milatuzumab (hLL1, Immunomedics, Inc.), mitumomab (BEC-2, ImClone Systems), narnatumab (ImClone Systems), necitumumab (Portrazza™, Eli Lilly), nesvacumab (Regeneron Pharmaceuticals), nimotuzumab (h-R3, BIOMAb EGFR, TheraCIM, Theraloc, or CIMAher; Biotech Pharmaceutical Co.), nivolumab (Opdivo®, Bristol-Myers Squibb), obinutuzumab (Gazyva or Gazyvaro; Hoffmann-La Roche), ocaratuzumab (AME-133v, LY2469298; Mentrik Biotech, LLC), ofatumumab (Arzerra®, Genmab), onartuzumab (Genentech), Ontuxizumab (Morphotek, Inc.), oregovomab (OvaRex®, AltaRex Corp.), otlertuzumab (Emergent BioSolutions), panitumumab (ABX-EGF, Amgen), pankomab (Glycotope GMBH), parsatuzumab (Genentech), patritumab, pembrolizumab (Keytruda®, Merck), pemtumomab (Theragyn, Antisoma), pertuzumab (Perjeta, Genentech), pidilizumab (CT-011, Medivation), polatuzumab vedotin (Genentech/Roche), pritumumab, racotumomab (Vaxira®, Recombio), ramucirumab (Cyramza®, ImClone Systems Inc.), rituximab (Rituxan®, Genentech), robatumumab (Schering-Plough), Seribantumab (Sanofi/Merrimack Pharmaceuticals, Inc.), sibrotuzumab, siltuximab (Sylvant™, Janssen Biotech), Smart MI95 (Protein Design Labs, Inc.), Smart ID10 (Protein Design Labs, Inc.), tabalumab (LY2127399, Eli Lilly), taplitumomab paptox, tenatumomab, teprotumumab (Roche), tetulomab, TGN1412 (CD28-SuperMAB or TAB08), tigatuzumab (CD-1008, Daiichi Sankyo), tositumomab, trastuzumab (Herceptin®), tremelimumab (CP-672,206; Pfizer), tucotuzumab celmoleukin (EMD Pharmaceuticals), ublituximab, urelumab (BMS-663513, Bristol-Myers Squibb), volociximab (M200, Biogen Idec), and zatuximab.

In some instances, the antibody or binding fragment thereof is an antibody-drug conjugate (ADC). In some cases, the payload of the ADC comprises, for example, but is not limited to, an auristatin derivative, maytansine, a maytansinoid, a taxane, a calicheamicin, cemadotin, a duocarmycin, a pyrrolobenzodiazepine (PBD), or a tubulysin. In some instances, the payload comprises monomethyl auristatin E (MMAE) or monomethyl auristatin F (MMAF). In some instances, the payload comprises DM1 (mertansine) or DM4. In some instances, the payload comprises a pyrrolobenzodiazepine dimer.

Additional Cargos

In some embodiments, the cargo is a peptidomimetic. A peptidomimetic is a small protein-like polymer designed to mimic a peptide. In some instances, the peptidomimetic comprises D-peptides. In other instances, the peptidomimetic comprises L-peptides. Exemplary peptidomimetics include peptoids and β-peptides.

In some embodiments, the cargo is a nucleotidomimetic.

Vectors and Expression Systems

In certain embodiments, the Arc polypeptides, endo-Gag polypeptides, engineered Arc and engineered endo-Gag polypeptides described supra are encoded by plasmid vectors. In some embodiments, vectors include any suitable vectors derived from either eukaryotic or prokaryotic sources. In some cases, vectors are obtained from bacteria (e.g., E. coli), insects, yeast (e.g., Pichia pastoris), algae, or mammalian sources.

Exemplary bacterial vectors include pACYC177, pASK75, pBAD vector series, pBADM vector series, pET vector series, pETM vector series, pGEX vector series, pHAT, pHAT2, pMal-c2, pMal-p2, pQE vector series, pRSET A, pRSET B, pRSET C, pTrcHis2 series, pZA31-Luc, pZE21-MCS-1, pFLAG ATS, pFLAG CTS, pFLAG MAC, pFLAG Shift-12c, pTAC-MAT-1, pFLAG CTC, or pTAC-MAT-2.

Exemplary insect vectors include pFastBac1, pFastBac DUAL, pFastBac ET, pFastBac HTa, pFastBac HTb, pFastBac HTc, pFastBac M30a, pFastBac M30b, pFastBac M30c, pVL1392, pVL1393, pVL1393 M10, pVL1393 M11, pVL1393 M12, FLAG vectors such as pPolh-FLAG1 or pPolh-MAT 2, or MAT vectors such as pPolh-MAT1 or pPolh-MAT2.

In some cases, yeast vectors include Gateway® pDEST™ 14 vector, Gateway® pDEST™ 15 vector, Gateway® pDEST™ 17 vector, Gateway® pDEST™ 24 vector, Gateway® pYES-DEST52 vector, pBAD-DEST49 Gateway® destination vector, pAO815 Pichia vector, pFLD1 Pichia pastoris vector, pGAPZ A, B, & C Pichia pastoris vector, pPIC3.5K Pichia vector, pPIC6 A, B, & C Pichia vector, pPIC9K Pichia vector, pTEF1/Zeo, pYES2 yeast vector, pYES2/CT yeast vector, pYES2/NT A, B, & C yeast vector, or pYES3/CT yeast vector.

Exemplary algae vectors include pChlamy-4 vector or MCS vector.

Examples of mammalian vectors include transient expression vectors or stable expression vectors. Mammalian transient expression vectors include p3×FLAG-CMV 8, pFLAG-Myc-CMV 19, pFLAG-Myc-CMV 23, pFLAG-CMV 2, pFLAG-CMV 6a,b,c, pFLAG-CMV 5.1, pFLAG-CMV 5a,b,c, p3×FLAG-CMV 7.1, pFLAG-CMV 20, p3×FLAG-Myc-CMV 24, pCMV-FLAG-MAT1, pCMV-FLAG-MAT2, pBICEP-CMV 3, or pBICEP-CMV 4. Mammalian stable expression vectors include pFLAG-CMV 3, p3×FLAG-CMV 9, p3×FLAG-CMV 13, pFLAG-Myc-CMV 21, p3×FLAG-Myc-CMV 25, pFLAG-CMV 4, p3×FLAG-CMV 10, p3×FLAG-CMV 14, pFLAG-Myc-CMV 22, p3×FLAG-Myc-CMV 26, pBICEP-CMV 1, or pBICEP-CMV 2.

In some instances, a cell-free system is a mixture of cytoplasmic and/or nuclear components from a cell and is used for in vitro nucleic acid synthesis. In some cases, a cell-free system utilizes either prokaryotic cell components or eukaryotic cell components. Sometimes, nucleic acid synthesis is obtained in a cell-free system based on, for example, Drosophila cells, Xenopus eggs, or HeLa cells (ATCC® CCL-2™). Exemplary cell-free systems include, but are not limited to, E. coli S30 Extract system, E. coli T7 S30 system, or PURExpress®.

Host Cells

In some embodiments, a host cell includes any suitable cell such as a naturally derived cell or a genetically modified cell. In some instances, a host cell is a production host cell. In some instances, a host cell is a eukaryotic cell. In other instances, a host cell is a prokaryotic cell. In some cases, a eukaryotic cell includes fungi (e.g., a yeast cell), an animal cell, or a plant cell. In some cases, a prokaryotic cell is a bacterial cell. Examples of bacterial cells include gram-positive bacteria or gram-negative bacteria. In some embodiments, the gram-negative bacteria are anaerobic, rod-shaped, or both.

In some instances, gram-positive bacteria include Actinobacteria, Firmicutes or Tenericutes. In some cases, gram-negative bacteria include Aquificae, Deinococcus-Thermus, Fibrobacteres-Chlorobi/Bacteroidetes (FCB group), Fusobacteria, Gemmatimonadetes, Nitrospirae, Planctomycetes-Verrucomicrobia/Chlamydiae (PVC group), Proteobacteria, Spirochaetes or Synergistetes. In some embodiments, the bacteria are Acidobacteria, Chloroflexi, Chrysiogenetes, Cyanobacteria, Deferribacteres, Dictyoglomi, Thermodesulfobacteria or Thermotogae. In some embodiments, a bacterial cell is Escherichia coli, Clostridium botulinum, or Coli bacilli.

Exemplary prokaryotic host cells include, but are not limited to, BL21, Mach1™, DH10B™, TOP10, DH5α, DH10Bac™, OmniMax™, MegaX™, DH12S™, INV110, TOP10F′, INVαF, TOP10/P3, ccdB Survival, PIR1, PIR2, Stbl2™, Stbl3™, or Stbl4™.

In some instances, animal cells include a cell from a vertebrate or from an invertebrate. In some cases, an animal cell includes a cell from a marine invertebrate, fish, insects, amphibian, reptile, mammal, or human. In some cases, a fungus cell includes a yeast cell, such as brewer's yeast, baker's yeast, or wine yeast.

Fungi include ascomycetes such as yeast, mold, filamentous fungi, basidiomycetes, or zygomycetes. In some instances, yeast includes Ascomycota or Basidiomycota. In some cases, Ascomycota includes Saccharomycotina (true yeasts, e.g., Saccharomyces cerevisiae (baker's yeast)) or Taphrinomycotina (e.g., Schizosaccharomycetes (fission yeasts)). In some cases, Basidiomycota includes Agaricomycotina (e.g., Tremellomycetes) or Pucciniomycotina (e.g., Microbotryomycetes).

Exemplary yeast or filamentous fungi include, for example, the genus: Saccharomyces, Schizosaccharomyces, Candida, Pichia, Hansenula, Kluyveromyces, Zygosaccharomyces, Yarrowia, Trichosporon, Rhodosporidium, Aspergillus, Fusarium, or Trichoderma. Exemplary yeast or filamentous fungi include, for example, the species: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida utilis, Candida boidinii, Candida albicans, Candida tropicalis, Candida stellatoidea, Candida glabrata, Candida krusei, Candida parapsilosis, Candida guilliermondii, Candida viswanathii, Candida lusitaniae, Rhodotorula mucilaginosa, Pichia methanolica, Pichia angusta, Pichia pastoris, Pichia anomala, Hansenula polymorpha, Kluyveromyces lactis, Zygosaccharomyces rouxii, Yarrowia lipolytica, Trichosporon pullulans, Rhodosporidium toruloides, Aspergillus niger, Aspergillus nidulans, Aspergillus awamori, Aspergillus oryzae, Trichoderma reesei, Yarrowia lipolytica, Brettanomyces bruxellensis, Candida stellata, Schizosaccharomyces pombe, Torulaspora delbrueckii, Zygosaccharomyces bailii, Cryptococcus neoformans, Cryptococcus gattii, or Saccharomyces boulardii.

Exemplary yeast host cells include, but are not limited to, Pichia pastoris yeast strains such as GS115, KM71H, SMD1168, SMD1168H, and X-33; and Saccharomyces cerevisiae yeast strains such as INVSc1.

In some instances, additional animal cells include cells obtained from a mollusk, arthropod, annelid or sponge. In some cases, an additional animal cell is a mammalian cell, e.g., from a human, primate, ape, equine, bovine, porcine, canine, feline or rodent. In some cases, a rodent includes mouse, rat, hamster, gerbil, chinchilla, fancy rat, or guinea pig.

Exemplary mammalian host cells include, but are not limited to, 293A cell line, 293FT cell line, 293F cells, 293 H cells, CHO DG44 cells, CHO-S cells, CHO-K1 cells, Expi293F™ cells, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cells, FreeStyle™ CHO-S cells, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cells, T-REx™ Jurkat cell line, Per.C6 cells, T-REx™-293 cell line, T-REx™-CHO cell line, and T-REx™-HeLa cell line.

In some instances, a mammalian host cell is a primary cell. In some instances, a mammalian host cell is a stable cell line, or a cell line that has incorporated a genetic material of interest into its own genome and has the capability to express the product of the genetic material after many generations of cell division. In some cases, a mammalian host cell is a transient cell line, or a cell line that has not incorporated a genetic material of interest into its own genome and does not have the capability to express the product of the genetic material after many generations of cell division.

Exemplary insect host cells include, but are not limited to, Drosophila S2 cells, Sf9 cells, Sf21 cells, High Five™ cells, and expresSF+® cells.

In some instances, plant cells include a cell from algae. Exemplary plant cell lines include, but are not limited to, strains from Chlamydomonas reinhardtii 137c, or Synechococcus elongatus PCC 7942.

Methods of Use

Disclosed herein, in certain embodiments, are methods of preparing a capsid which encapsulates a cargo. In some embodiments, the method comprises incubating a plurality of Arc or endo-Gag polypeptides, engineered Arc or endo-Gag polypeptides, and/or recombinant Arc or endo-Gag polypeptides with a cargo in a solution for a time sufficient to generate a loaded Arc-based capsid or endo-Gag-based capsid.

In some instances, the method comprises mixing a solution comprising a plurality of engineered and/or recombinant Arc polypeptides with a plurality of non-Arc capsid forming subunits prior to incubating with the cargo. In some cases, the plurality of non-Arc capsid forming subunits are mixed with the plurality of engineered and/or recombinant Arc polypeptides at a ratio of 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, or 10:1. In other cases, the plurality of non-Arc capsid forming subunits are mixed with the plurality of engineered and/or recombinant Arc polypeptides at a ratio of 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, or 1:10.
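By way of non-limiting illustration, the mixing ratios recited above can be converted into amounts of each subunit type for a given total. The sketch below assumes that the ratios are molar and are written as non-Arc:Arc; the disclosure does not state the basis of the ratio, and the function name and nanomole units are hypothetical.

```python
def split_by_ratio(total_nmol: float, non_arc_parts: int, arc_parts: int) -> tuple[float, float]:
    """Split a total subunit amount into (non_arc_nmol, arc_nmol).

    ASSUMPTION: the ratio non_arc_parts:arc_parts is interpreted as a molar ratio.
    """
    if total_nmol < 0:
        raise ValueError("total_nmol must be non-negative")
    if non_arc_parts <= 0 or arc_parts <= 0:
        raise ValueError("ratio parts must be positive")
    total_parts = non_arc_parts + arc_parts
    non_arc_nmol = total_nmol * non_arc_parts / total_parts
    arc_nmol = total_nmol * arc_parts / total_parts
    return non_arc_nmol, arc_nmol


# A 3:1 ratio (non-Arc:Arc) applied to 100 nmol of total subunits.
print(split_by_ratio(100.0, 3, 1))

# A 1:10 ratio (non-Arc:Arc) applied to 110 nmol of total subunits.
print(split_by_ratio(110.0, 1, 10))
```

If the ratio were instead expressed by mass, the same function would apply with the total given in mass units.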

In some cases, the time sufficient to generate a loaded Arc-based capsid or endo-Gag-based capsid is at least about 5 minutes, at least about 10 minutes, at least about 20 minutes, at least about 30 minutes, at least about 1 hour, at least about 2 hours, at least about 4 hours, at least about 6 hours, at least about 10 hours, at least about 12 hours, at least about 24 hours, or more.

In some cases, the Arc-based capsid or endo-Gag-based capsid is prepared at a temperature from about 2° C. to about 37° C. In some instances, the Arc-based capsid or endo-Gag-based capsid is prepared at a temperature from about 2° C. to about 8° C., about 2° C. to about 4° C., about 20° C. to about 37° C., about 25° C. to about 37° C., about 20° C. to about 30° C., about 25° C. to about 30° C., or about 30° C. to about 37° C.

In some cases, the Arc-based capsid or endo-Gag-based capsid is prepared at room temperature.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for systemic administration.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for local administration.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for parenteral (e.g., intra-arterial, intra-articular, intradermal, intralesional, intramuscular, intraocular, intraosseous infusion, intraperitoneal, intrathecal, intravenous, intravitreal, or subcutaneous) administration.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for topical administration.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for oral administration.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for sublingual administration.

In some instances, the Arc-based capsid or endo-Gag-based capsid is further formulated for aerosol administration.

In certain embodiments, also described herein is a use of an Arc-based capsid or endo-Gag-based capsid for delivery of a cargo to a site of interest. In some instances, the method comprises contacting a cell at the site of interest with an Arc-based capsid or endo-Gag-based capsid for a time sufficient to facilitate cellular uptake of the capsid.

In some cases, the cell is a muscle cell, a skin cell, a blood cell, or an immune cell (e.g., a T cell or a B cell).

In some instances, the cell is a tumor cell, e.g., a solid tumor cell or a cell from a hematologic malignancy. In some cases, the solid tumor cell is a cell from a bladder cancer, breast cancer, brain cancer, colorectal cancer, kidney cancer, liver cancer, lung cancer, pancreatic cancer, prostate cancer, skin cancer, stomach cancer, or thyroid cancer. In some cases, the cell from a hematologic malignancy is from a B-cell malignancy or a T-cell malignancy. In some cases, the cell is from a leukemia, a lymphoma, a myeloma, chronic lymphocytic leukemia (CLL), small lymphocytic lymphoma (SLL), diffuse large B cell lymphoma (DLBCL), follicular lymphoma, mantle cell lymphoma, Burkitt lymphoma, cutaneous T-cell lymphoma, peripheral T cell lymphoma, multiple myeloma, plasmacytoma, acute lymphoblastic leukemia (ALL), acute myeloid leukemia (AML), or chronic myeloid leukemia (CML).

In some embodiments, the cell is a somatic cell. In some instances, the cell is a blood cell, a skin cell, a connective tissue cell, a bone cell, a muscle cell, or a cell from an organ.

In some embodiments, the cell is an epithelial cell, a connective tissue cell, a muscular cell, or a neuron.

In some instances, the cell is an endodermal cell, a mesodermal cell, or an ectodermal cell. In some instances, the endoderm comprises cells of the respiratory system, the intestine, the liver, the gallbladder, the pancreas, the islets of Langerhans, the thyroid, or the hindgut. In some cases, the mesoderm comprises osteochondroprogenitor cells, muscle cells, cells from the digestive system, renal stem cells, cells from the reproductive system, or cells from the circulatory system (such as endothelial cells). Exemplary cells from the ectoderm comprise epithelial cells, cells of the anterior pituitary, cells of the peripheral nervous system, cells of the neuroendocrine system, cells of the eyes, cells of the central nervous system, cells of the ependyma, or cells of the pineal gland. In some cases, cells derived from the central and peripheral nervous system comprise neurons, Schwann cells, satellite glial cells, oligodendrocytes, or astrocytes. In some cases, neurons further comprise interneurons, pyramidal neurons, GABAergic neurons, dopaminergic neurons, serotoninergic neurons, glutamatergic neurons, motor neurons from the spinal cord, or inhibitory spinal neurons.

In some embodiments, the cell is a stem cell or a progenitor cell. In some cases, the cell is a mesenchymal stem or progenitor cell. In other cases, the cell is a hematopoietic stem or progenitor cell.

In some cases, a target protein is overexpressed or is depleted in the cell. In some cases, the target protein is overexpressed in the cell. In additional cases, the target protein is depleted in the cell.

In some cases, a target gene in the cell has one or more mutations.

In some cases, the cell comprises an impaired splicing mechanism.

In some instances, the Arc-based capsid is administered systemically to a subject in need thereof.

In other instances, the Arc-based capsid or endo-Gag-based capsid is administered locally to a subject in need thereof.

In some embodiments, the Arc-based capsid or endo-Gag-based capsid is administered parenterally, orally, topically, sublingually, or by aerosol to a subject in need thereof. In some cases, the Arc-based capsid or endo-Gag-based capsid is administered parenterally to a subject in need thereof. In other cases, the Arc-based capsid or endo-Gag-based capsid is administered orally to a subject in need thereof. In additional cases, the Arc-based capsid or endo-Gag-based capsid is administered topically, sublingually, or by aerosol to a subject in need thereof.

In some embodiments, a delivery component is combined with an Arc-based capsid or endo-Gag-based capsid for a targeted delivery to a site of interest. In some instances, the delivery component comprises a carrier, e.g., an extracellular vesicle such as a micelle, a liposome, or a microvesicle; or a viral envelope.

In some instances, the delivery component serves as a primary delivery vehicle for an Arc-based capsid or endo-Gag-based capsid which does not comprise its own delivery component (e.g., in which the second polypeptide is not present). In such cases, the delivery component directs the Arc-based capsid or endo-Gag-based capsid to a target site of interest and optionally facilitates intracellular uptake.

In other instances, the delivery component enhances target specificity and/or sensitivity of an Arc-based capsid's second polypeptide. In such cases, the delivery component enhances the specificity and/or affinity of the Arc-based capsid or endo-Gag-based capsid to the target site. In additional cases, the delivery component enhances the specificity and/or affinity by about 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 50-fold, 100-fold, 200-fold, 500-fold, or more. In further cases, the delivery component enhances the specificity and/or affinity by about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 500%, or more. Further still, the delivery component optionally minimizes off-target effect by about 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 50-fold, 100-fold, 200-fold, 500-fold, or more. Further still, the delivery component optionally minimizes off-target effect by about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 500%, or more.

In additional instances, the delivery component serves as a first vehicle that transports an Arc-based capsid to a general target region (e.g., a tumor microenvironment) and the Arc-based or endo-Gag-based capsid's second polypeptide serves as a second delivery molecule that drives the Arc-based capsid or endo-Gag-based capsid to the specific target site and optionally facilitates intracellular uptake. In such cases, the delivery component minimizes off-target effect by about 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 20-fold, 30-fold, 50-fold, 100-fold, 200-fold, 500-fold, or more. In such cases, the delivery component minimizes off-target effect by about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 500%, or more.

In further instances, the delivery component serves as a first vehiclethat transports an Arc-based capsid to a target site of interest and theArc-based or endo-Gag-based capsid's second polypeptide serves as asecond delivery molecule that facilitates intracellular uptake.

In some embodiments, the delivery component comprises an extracellular vesicle. In some instances, the extracellular vesicle comprises a microvesicle, a liposome, or a micelle. In some instances, the extracellular vesicle has a diameter of from about 10 nm to about 2000 nm, from about 10 nm to about 1000 nm, from about 10 nm to about 800 nm, from about 20 nm to about 600 nm, from about 30 nm to about 500 nm, from about 50 nm to about 200 nm, or from about 80 nm to about 100 nm.

In some embodiments, the delivery component comprises a microvesicle. Also known as circulating microvesicles or microparticles, microvesicles are membrane-bound vesicles that comprise phospholipids. In some instances, the microvesicle has a diameter of from about 50 nm to about 1000 nm, from about 100 nm to about 800 nm, from about 200 nm to about 500 nm, or from about 50 nm to about 400 nm.

In some instances, the microvesicle originates from cell membrane inversion, exocytosis, shedding, blebbing, or budding. In some instances, the microvesicles are generated from differentiated cells. In other instances, the microvesicles are generated from undifferentiated cells, e.g., from blast cells, progenitor cells, or stem cells.

In some embodiments, the delivery component comprises a liposome. In some instances, the liposome comprises a plurality of lipopeptides, which are presented on the surface of the liposome, for targeted delivery to a site or region of interest. In some cases, the liposomes fuse with the target cell, whereby the contents of the liposome are then emptied into the target cell. In some cases, a liposome is endocytosed by cells that are phagocytic. Endocytosis is then followed by intralysosomal degradation of liposomal lipids and release of the encapsulated agents.

Exemplary liposomes suitable for incorporation include, but are not limited to, multilamellar vesicles (MLV), oligolamellar vesicles (OLV), unilamellar vesicles (UV), small unilamellar vesicles (SUV), medium-sized unilamellar vesicles (MUV), large unilamellar vesicles (LUV), giant unilamellar vesicles (GUV), multivesicular vesicles (MVV), single or oligolamellar vesicles made by reverse-phase evaporation method (REV), multilamellar vesicles made by the reverse-phase evaporation method (MLV-REV), stable plurilamellar vesicles (SPLV), frozen and thawed MLV (FATMLV), vesicles prepared by extrusion methods (VET), vesicles prepared by French press (FPV), vesicles prepared by fusion (FUV), dehydration-rehydration vesicles (DRV), and bubblesomes (BSV). In some instances, a liposome comprises Amphipol (A8-35). Techniques for preparing liposomes are described in, for example, COLLOIDAL DRUG DELIVERY SYSTEMS, vol. 66 (J. Kreuter ed., Marcel Dekker, Inc. (1994)).

Depending on the method of preparation, liposomes are unilamellar or multilamellar, and vary in size with diameters ranging from about 20 nm to greater than about 1000 nm.

In some instances, liposomes provided herein also comprise carrier lipids. In some embodiments, the carrier lipids are phospholipids. Carrier lipids capable of forming liposomes include, but are not limited to, dipalmitoylphosphatidylcholine (DPPC), phosphatidylcholine (PC; lecithin), phosphatidic acid (PA), phosphatidylglycerol (PG), phosphatidylethanolamine (PE), or phosphatidylserine (PS). Other suitable phospholipids further include distearoylphosphatidylcholine (DSPC), dimyristoylphosphatidylcholine (DMPC), dipalmitoylphosphatidylglycerol (DPPG), distearoylphosphatidylglycerol (DSPG), dimyristoylphosphatidylglycerol (DMPG), dipalmitoylphosphatidic acid (DPPA), dimyristoylphosphatidic acid (DMPA), distearoylphosphatidic acid (DSPA), dipalmitoylphosphatidylserine (DPPS), dimyristoylphosphatidylserine (DMPS), distearoylphosphatidylserine (DSPS), dipalmitoylphosphatidylethanolamine (DPPE), dimyristoylphosphatidylethanolamine (DMPE), distearoylphosphatidylethanolamine (DSPE) and the like, or combinations thereof. In some embodiments, the liposomes further comprise a sterol (e.g., cholesterol) which modulates liposome formation. The carrier lipids are optionally any non-phosphate polar lipids.

In some embodiments, the delivery component comprises a micelle. In some instances, the micelle has a diameter from about 2 nm to about 250 nm, from about 20 nm to about 200 nm, from about 20 nm to about 100 nm, or from about 50 nm to about 100 nm.

In some instances, the micelle is a polymeric micelle, characterized by a core-shell structure, in which the hydrophobic core is surrounded by a hydrophilic shell. In some cases, the hydrophilic shell further comprises a hydrophilic polymer or copolymer and a pH-sensitive component.

Exemplary hydrophilic polymers or copolymers include, but are not limited to, poly(N-substituted acrylamides), poly(N-acryloyl pyrrolidine), poly(N-acryloyl piperidine), poly(N-acryl-L-amino acid amides), poly(ethyl oxazoline), methylcellulose, hydroxypropyl acrylate, hydroxyalkyl cellulose derivatives and poly(vinyl alcohol), poly(N-isopropylacrylamide), poly(N-vinyl-2-pyrrolidone), polyethyleneglycol derivatives, and combinations thereof.

The pH-sensitive moiety includes, but is not limited to, an alkylacrylic acid such as methacrylic acid, ethylacrylic acid, propyl acrylic acid and butyl acrylic acid, or an amino acid such as glutamic acid.

In some instances, the hydrophobic moiety constitutes the core of the micelle and includes, for example, a single alkyl chain, such as octadecyl acrylate or a double chain alkyl compound such as phosphatidylethanolamine or dioctadecylamine. In some cases, the hydrophobic moiety is optionally a water insoluble polymer such as a poly(lactic acid) or a poly(ε-caprolactone).

Polymeric micelles exhibiting pH-sensitive properties are also contemplated and are formed, e.g., by using pH-sensitive polymers including, but not limited to, copolymers from methacrylic acid, methacrylic acid esters and acrylic acid esters, polyvinyl acetate phthalate, hydroxypropyl methyl cellulose phthalate, cellulose acetate phthalate, or cellulose acetate trimellitate.

In some embodiments, the delivery component comprises a viral envelope. Viral envelopes comprise glycoproteins, phospholipids, and additional proteins obtained from a host. In some instances, the viral envelope is permissive to a wide range of target cells. In other instances, the viral envelope is non-permissive and is specific to a target cell of interest. In some cases, the viral envelope comprises a cell-specific binding protein and optionally a fusogenic molecule that aids in the fusion of the cargo into a target cell. In some cases, the viral envelope comprises an endogenous viral envelope. In other cases, the viral envelope is a modified envelope, comprising one or more foreign proteins.

In some instances, the viral envelope is derived from a DNA virus. Exemplary enveloped DNA viruses include viruses from the family of Herpesviridae, Poxviridae, and Hepadnaviridae.

In other instances, the viral envelope is derived from an RNA virus. Exemplary enveloped RNA viruses include viruses from the family of Bunyaviridae, Coronaviridae, Filoviridae, Flaviviridae, Orthomyxoviridae, Paramyxoviridae, Rhabdoviridae, and Togaviridae.

In additional instances, the viral envelope is derived from a virus from the family of Retroviridae.

In some embodiments, the viral envelope is from an oncolytic virus, such as an oncolytic DNA virus from the family of Herpesviridae (for example, HSV1) or Poxviridae (for example, Vaccinia virus and myxoma virus); or an oncolytic RNA virus from the family of Rhabdoviridae (for example, VSV) or Paramyxoviridae (for example, MV and NDV).

In some instances, the viral envelope further comprises a foreign or engineered protein that binds to an antigen or a cell surface molecule. Exemplary antigens and cell surface molecules for targeting include, but are not limited to, P-glycoprotein, Her2/Neu, erythropoietin (EPO), epidermal growth factor receptor (EGFR), vascular endothelial growth factor receptor (VEGF-R), cadherin, carcinoembryonic antigen (CEA), CD4, CD8, CD19, CD20, CD33, CD34, CD45, CD117 (c-kit), CD133, HLA-A, HLA-B, HLA-C, chemokine receptor 5 (CCR5), stem cell marker ABCG2 transporter, ovarian cancer antigen CA125, immunoglobulins, integrins, prostate specific antigen (PSA), prostate stem cell antigen (PSCA), dendritic cell-specific intercellular adhesion molecule 3-grabbing nonintegrin (DC-SIGN), thyroglobulin, granulocyte-macrophage colony stimulating factor (GM-CSF), myogenic differentiation promoting factor-1 (MyoD-1), Leu-7 (CD57), LeuM-1, cell proliferation-associated human nuclear antigen defined by the monoclonal antibody Ki-67 (Ki-67), viral envelope proteins, HIV gp120, or transferrin receptor.

In some embodiments, the Arc-based capsid or endo-Gag-based capsid is for in vitro use.

In some instances, the Arc-based capsid or endo-Gag-based capsid is for ex vivo use.

In some cases, the Arc-based capsid or endo-Gag-based capsid is for in vivo use.

Kits/Article of Manufacture

Disclosed herein, in certain embodiments, are kits and articles of manufacture for use with one or more methods described herein. Such kits include a carrier, package, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in a method described herein. Suitable containers include, for example, bottles, vials, syringes, and test tubes. In one embodiment, the containers are formed from a variety of materials such as glass or plastic.

For example, the container(s) include a recombinant or engineered Arc or endo-Gag polypeptide described above. Such kits optionally include an identifying description or label or instructions relating to its use in the methods described herein. For example, a kit typically includes labels listing contents and/or instructions for use, and package inserts with instructions for use. A set of instructions will also typically be included.

Certain Terminologies

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood. It is to be understood that the detailed description is exemplary and explanatory only and is not restrictive of any subject matter claimed. In this application, the use of the singular includes the plural unless specifically stated otherwise. It must be noted that, as used in the specification, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. In this application, the use of “or” means “and/or” unless stated otherwise. Furthermore, use of the term “including” as well as other forms, such as “include”, “includes,” and “included,” is not limiting.

Although various features of the invention may be described in the context of a single embodiment, the features may also be provided separately or in any suitable combination. Conversely, although the invention may be described herein in the context of separate embodiments for clarity, the invention may also be implemented in a single embodiment.

Reference in the specification to “some embodiments”, “an embodiment”, “one embodiment” or “other embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments, of the inventions.

As used herein, ranges and amounts can be expressed as “about” a particular value or range. About also includes the exact amount. Hence “about 5 μL” means “about 5 μL” and also “5 μL.” Generally, the term “about” includes an amount that would be expected to be within experimental error.

The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.

As used herein, the sequence of a CA N-lobe described herein corresponds to the human CA N-lobe. In some instances, the human CA N-lobe comprises residues 207-278 of SEQ ID NO: 1. In some instances, a CA N-lobe described herein comprises about 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to residues 207-278 of SEQ ID NO: 1. In some cases, a CA N-lobe described herein shares a structural similarity with the human CA N-lobe. For example, a CA N-lobe described herein shares about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% structural similarity with the human CA N-lobe. In some cases, the CA N-lobe shares a high structural similarity (e.g., 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% structural similarity) but does not share a high sequence identity (e.g., the sequence identity is lower than 80%, lower than 70%, lower than 60%, lower than 50%, lower than 40%, or lower than 30%). In some cases, the CA N-lobe comprises residues 207-278 of SEQ ID NO: 1.

As used herein, the sequence of a CA C-lobe described herein corresponds to the human CA C-lobe. In some instances, the human CA C-lobe comprises residues 278-370 of SEQ ID NO: 1. In some instances, a CA C-lobe described herein comprises about 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to residues 278-370 of SEQ ID NO: 1. In some cases, a CA C-lobe described herein shares a structural similarity with the human CA C-lobe. For example, a CA C-lobe described herein shares about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% structural similarity with the human CA C-lobe. In some cases, the CA C-lobe shares a high structural similarity (e.g., 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% structural similarity) but does not share a high sequence identity (e.g., the sequence identity is lower than 80%, lower than 70%, lower than 60%, lower than 50%, lower than 40%, or lower than 30%). In some cases, the CA C-lobe comprises residues 278-370 of SEQ ID NO: 1.
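As a non-limiting illustration of the residue ranges and percent sequence identity values recited above, the following Python sketch extracts a lobe by 1-based residue numbering and scores identity between two pre-aligned sequences. The function names, the gap convention, and the toy sequences are assumptions made for illustration only; the specification does not prescribe a particular alignment tool or identity formula, and the numbering is assumed to follow the sequence exactly as supplied.

```python
# Illustrative sketch only: names and conventions below are assumptions,
# not a method defined in the specification.

# Residue ranges recited above for SEQ ID NO: 1 (1-based, inclusive).
CA_N_LOBE = (207, 278)
CA_C_LOBE = (278, 370)


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("region lies outside the sequence")
    return sequence[start - 1:end]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumed convention: '-' marks a gap; columns where both sequences have a
    gap are skipped, and a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy strings, not sequences from the Sequence Listing.
    print(extract_region("MKTAYIAKQR", 3, 5))
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

In practice, a candidate lobe would first be aligned to the extracted human region with an alignment program of the user's choice, and the aligned strings would then be passed to `percent_identity`.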

As used herein, the terms “individual(s)”, “subject(s)” and “patient(s)” mean any mammal. In some embodiments, the mammal is a human. In some embodiments, the mammal is a non-human. None of the terms require or are limited to situations characterized by the supervision (e.g., constant or intermittent) of a health care worker (e.g., a doctor, a registered nurse, a nurse practitioner, a physician's assistant, an orderly or a hospice worker).

EXAMPLES

These examples are provided for illustrative purposes only and not to limit the scope of the claims provided herein.

Example 1—Construction of DNA Vectors Encoding Recombinant Arc Proteins and Engineered Arc Proteins

To construct recombinant DNA vectors for Arc expression, full-length cDNA open reading frames, excluding the initial methionine, are inserted into a cloning vector and subsequently transferred into an expression vector according to standard methods. The same approach is used to construct recombinant DNA vectors for expressing endo-Gag proteins. Human Arc cDNA includes an annotated matrix domain (MA) and a capsid domain. The capsid domain has an N-terminal lobe (NTD) and a C-terminal lobe (CTD). FIG. 1 illustrates the structure of the Human Arc protein and the predicted structure of Arc from Python, Platypus, and Orca.

cDNAs encoding engineered Arc proteins are optionally generated by recombining Arc sequences from different species (FIG. 2), by inserting functional domains from other proteins into an Arc protein (FIG. 3A), by modifying the sequence of an Arc protein (FIG. 3B), and/or by any combination of the approaches exemplified in FIGS. 2-3. cDNAs encoding engineered endo-Gag proteins are likewise generated by recombining endo-Gag sequences from different species, by inserting functional domains from other proteins into an endo-Gag protein, by modifying the sequence of an endo-Gag protein, and/or by any combination of these approaches. Furthermore, an engineered endo-Gag protein optionally contains Arc sequences and an engineered Arc protein optionally contains endo-Gag sequences. Engineered Arc and endo-Gag protein monomers assemble into capsids.

cDNAs encoding the Arc and endo-Gag proteins of Table 1 were inserted into an expression vector derived from pET-41a(+) (EMD Millipore (Novagen) Cat #70566). The entire cloning site of pET-41a(+) was removed and replaced with the DNA having the nucleotide sequence of SEQ ID NO: 57, which encodes an alternative N-terminal tag having the amino acid sequence of SEQ ID NO: 58 and comprising a 6×His tag (SEQ ID NO: 59), a 6 amino acid spacer (SEQ ID NO: 60), and an AcTEV™ cleavage site (SEQ ID NO: 61). Arc and endo-Gag open reading frames without their starting methionine codon were inserted after the AcTEV™ cleavage site by Gibson assembly (Gibson D G, Young L, Chuang R Y, Venter J C, Hutchison C A 3rd, Smith H O (2009). “Enzymatic assembly of DNA molecules up to several hundred kilobases”. Nature Methods. 6 (5): 343-345). After expression and AcTEV™ cleavage, the N-terminus of the resulting Arc or endo-Gag protein has a single residual Glycine from the AcTEV™ cleavage site.

SEQ ID NO: 57 ATGCATCACCATCACCATCACGGCTCAGGGTCTGGTAGCGAAAATCTGTACTTCCAGGGG
SEQ ID NO: 58 MHHHHHHGSGSGSENLYFQG
SEQ ID NO: 59 HHHHHH
SEQ ID NO: 60 GSGSGS
SEQ ID NO: 61 ENLYFQG
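The relationship between the tag-encoding DNA (SEQ ID NO: 57) and the tag polypeptide (SEQ ID NO: 58) can be checked with a short translation routine. The following Python sketch is illustrative only: the function and variable names are assumptions, and it uses the standard genetic code, which is assumed to apply to expression in E. coli.

```python
# Illustrative sketch: translate SEQ ID NO: 57 with the standard genetic code
# and locate the 6xHis tag, spacer, and AcTEV cleavage site within the result.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


SEQ_ID_NO_57 = "ATGCATCACCATCACCATCACGGCTCAGGGTCTGGTAGCGAAAATCTGTACTTCCAGGGG"
tag = translate(SEQ_ID_NO_57)

his_tag = tag[1:7]        # SEQ ID NO: 59
spacer = tag[7:13]        # SEQ ID NO: 60
cleavage_site = tag[13:]  # SEQ ID NO: 61

if __name__ == "__main__":
    print(tag)  # MHHHHHHGSGSGSENLYFQG
    print(his_tag, spacer, cleavage_site)
```

The final residue of the translated tag is the glycine that remains at the N-terminus of the Arc or endo-Gag protein after AcTEV™ cleavage, consistent with the description above.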

TABLE 1 Sequences of Arc and endo-Gag polypeptides and nucleotides.

Gene Name | Species (common name) | Species (proper name) | Sequence ID | SEQ ID NO: (amino acid) | SEQ ID NO: (DNA)
Arc | Human | Homo sapiens | NP_056008.1 | 1 | 29
Arc | Killer Whale | Orcinus orca | XP_004265337.1 | 2 | 30
Arc | White Tailed Deer | Odocoileus virginianus texanus | XP_020755692.1 | 3 | 31
Arc | Platypus | Ornithorhynchus anatinus | XP_001512750.1 | 4 | 32
Arc | Goose | Anser cygnoides domesticus | XP_013046406.1 | 5 | 33
Arc | Dalmatian Pelican | Pelecanus crispus | KFQ60200.1 | 6 | 34
Arc | White Tailed Eagle | Haliaeetus albicilla | KFQ04633.1 | 7 | 35
Arc | King Cobra | Ophiophagus hannah | ETE60609.1 | 8 | 36
Arc | Ray Finned Fish | Austrofundulus limnaeus | XP_013881732.1 | 9 | 37
Arc | Sperm Whale | Physeter catodon | XP_007119193.2 | 10 | 38
Arc | Turkey | Meleagris gallopavo | XP_010707654.1 | 11 | 39
Arc | Central Bearded Dragon | Pogona vitticeps | XP_020633722.1 | 12 | 40
Arc | Chinese Alligator | Alligator sinensis | XP_006027442.1 | 13 | 41
Arc | American Alligator | Alligator mississippiensis | XP_019337372.1 | 14 | 42
Arc | Japanese Gecko | Gekko japonicus | XP_015273745.1 | 15 | 43
PNMA3 | Human | Homo sapiens | NP_001269464.1 | 16 | 44
PNMA5 | Human | Homo sapiens | NP_001096620.1 | 17 | 45
PNMA6A | Human | Homo sapiens | NP_116271.3 | 18 | 46
PNMA6B | Human | Homo sapiens | SP_P0C5W0.1 | 19 | 47
RTL3 | Human | Homo sapiens | NP_689907.1 | 20 | 48
RTL6 | Human | Homo sapiens | NP_115663.2 | 21 | 49
RTL8A | Human | Homo sapiens | NP_001071640.1 | 22 | 50
RTL8B | Human | Homo sapiens | NP_001071641.1 | 23 | 51
BOP | Human | Homo sapiens | NP_078903.3 | 24 | 52
LDOC1 | Human | Homo sapiens | NP_036449.1 | 25 | 53
ZNF18 | Human | Homo sapiens | NP_001290210.1 | 26 | 54
MOAP1 | Human | Homo sapiens | AAG31786.1 | 27 | 55
PEG10 | Human | Homo sapiens | NP_055883.2 | 28 | 56

Example 2—Expression and Purification of Arc and Endo-Gag Proteins

Expression vector constructs comprising Arc and endo-Gag open reading frames were transformed into the Rosetta 2 (DE3)pLysS E. coli strain (Millipore Sigma, Cat #71403). Arc or endo-Gag expression was induced with 0.1 mM IPTG followed by a 16-hour incubation at 16° C. Cell pellets were lysed by sonication in 20 mM sodium phosphate pH 7.4, 0.1 M NaCl, 40 mM imidazole, 1 mM DTT, and 10% glycerol. The lysate was treated with excess TURBO DNase (Thermo Fisher Scientific, Cat #AM2238), RNase Cocktail (Thermo Fisher Scientific, Cat #AM2286), and Benzonase Nuclease (Millipore Sigma, Cat #71205) to eliminate nucleic acids. NaCl was added to the lysate in order to adjust the NaCl concentration to 0.5 M, followed by centrifugation and filtration to remove cellular debris. 6×His-tagged recombinant protein was loaded onto a HisTrap HP column (GE Healthcare, Cat #17-5247-01), washed with buffer A (20 mM sodium phosphate pH 7.4, 0.5 M NaCl, 40 mM imidazole, and 10% glycerol), and eluted with a linear gradient of buffer B (20 mM sodium phosphate pH 7.4, 0.5 M NaCl, 500 mM imidazole, and 10% glycerol). Collection tubes were supplemented in advance with 10 μl of 0.5 M EDTA pH 8.0 per 1 ml eluate. The resulting Arc or endo-Gag protein is generally more than 95% pure as revealed by SDS-PAGE analysis, with a yield of up to 50 mg per 1 L of bacterial culture (FIG. 4A).
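The salt adjustment and EDTA supplementation described above reduce to simple mass-balance arithmetic, sketched below in Python for illustration. The 40 ml lysate volume and the 5 M NaCl stock are hypothetical values chosen for the example and are not stated in this Example; the helper names are likewise assumptions, and volumes are treated as additive.

```python
# Illustrative mass-balance helpers. Lysate volume and stock concentration
# used in the example are hypothetical, not taken from the protocol.


def stock_volume_to_add(v_initial_ml: float, c_initial: float,
                        c_target: float, c_stock: float) -> float:
    """Volume of stock (ml) that brings a solution from c_initial to c_target.

    From c_initial*V + c_stock*v = c_target*(V + v):
        v = V * (c_target - c_initial) / (c_stock - c_target)
    All concentrations must share the same unit (e.g., M).
    """
    if not (c_initial <= c_target < c_stock):
        raise ValueError("need c_initial <= c_target < c_stock")
    return v_initial_ml * (c_target - c_initial) / (c_stock - c_target)


def final_concentration(v_a_ml: float, c_a: float,
                        v_b_ml: float, c_b: float) -> float:
    """Concentration after mixing two solutions, assuming additive volumes."""
    return (v_a_ml * c_a + v_b_ml * c_b) / (v_a_ml + v_b_ml)


if __name__ == "__main__":
    # Raise NaCl from 0.1 M (lysis buffer) to 0.5 M in a hypothetical 40 ml
    # lysate using a hypothetical 5 M NaCl stock.
    v_nacl = stock_volume_to_add(40.0, 0.1, 0.5, 5.0)
    print(f"5 M NaCl to add: {v_nacl:.2f} ml")

    # 10 ul of 0.5 M EDTA collected together with 1 ml of eluate.
    edta_m = final_concentration(1.0, 0.0, 0.010, 0.5)
    print(f"Final EDTA: {edta_m * 1000:.2f} mM")
```

Under these assumptions, roughly 3.6 ml of stock is needed per 40 ml of lysate, and the pre-loaded EDTA gives a final concentration of just under 5 mM in each collected fraction.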

Residual nucleic acid was removed by anion exchange chromatography on a mono Q 5/50 GL column (GE Healthcare, Cat #17516601). Before loading onto the column, recombinant protein was buffer exchanged to buffer C (20 mM Tris-HCl pH 8.0, 100 mM NaCl, and 10% glycerol) using “Pierce Protein Concentrator PES, 10K MWCO, 5-20 ml” (Thermo Scientific, Cat #88528) according to the manufacturer's protocol. After loading, the mono Q resin was washed with 2 ml of buffer C. Arc and endo-Gag proteins were eluted using a linear gradient of buffer D (20 mM Tris-HCl pH 8.0, 500 mM NaCl, and 10% glycerol). RNA efficiently separated from Arc and eluted at 600 mM NaCl (FIG. 4B).

The N-terminal 6×His tag and spacer were removed by concentrating peak fractions of the mono Q purified Arc using a 10 kDa MWCO PES concentrator and then treating with 10% v/v of AcTEV™ Protease (Invitrogen™ #12575023). The cleavage efficiency is above 99% as revealed by SDS-PAGE assay. The protein is then diluted into HisTrap Buffer A and cleaned with HisTrap HP resin. The resulting purified Arc has an N-terminal Glycine residue and does not contain the initial methionine.

Example 3—Capsid Assembly

Cleaved Arc protein (1 mg/mL) was loaded into a 20 kDa MWCO dialysis cassette and dialyzed overnight in 1 M sodium phosphate (pH 7.5) at room temperature. The following day, the solution was removed from the cassette, transferred to microcentrifuge tubes, and spun at max speed for 5 minutes in a tabletop centrifuge. The supernatant was transferred to a 100 kDa MWCO Regenerated Cellulose Amicon Ultrafiltration Centrifugal concentrator. The buffer was exchanged to PBS pH 7.5 and the volume was reduced 20-fold.

Capsid assembly was assayed by transmission electron microscopy. EM grids (Carbon Support Film, Square Grid, 400 mesh, 5-6 nm, Copper, CF400-Cu-UL) were prepared by glow discharge. A 5 μL sample of purified Arc was applied to the grid for 20 seconds and then wicked away using filter paper. The grid was then washed with MilliQ H₂O, stained with 5 μL of 1% Uranyl Acetate in H₂O for 30 seconds, and air dried for 1 minute. Images of Arc capsids were acquired using a FEI Talos L120C TEM equipped with a Gatan 4 k×4 k OneView camera. FIG. 5 shows concentrated human Arc capsids. FIG. 6 shows capsids formed from recombinantly expressed Arc orthologs from other vertebrate species. FIG. 7 shows capsids formed from recombinantly expressed endo-Gag genes from other vertebrate species.

Example 4—Selective Cellular Internalization of Arc Capsids

Capsids assembled from isolated recombinant human Arc protein (0.5 mg/ml) were fluorescently labeled by reacting with a 50-fold molar excess of Alexa Fluor™ 594 NHS ester dye (Invitrogen™ #A20004) (dissolved in DMSO) in PBS (pH 8.5). Reactions were allowed to proceed for 2 hours in the dark. Alexa594-labeled capsids were then dialyzed against PBS (pH 7.5) overnight at room temperature in the dark with at least two buffer exchanges to remove any unreacted dye.

HeLa cells (ATCC® CCL-2™) were seeded 24 hours prior to the experiment in 96-well plates at counts such that they reached ˜80% confluency for treatment. Labeled capsids were then spiked into complete tissue culture media to a final capsid concentration of 0.05 mg/ml. Treatments proceeded for 4 hours at 37° C., and then cells were washed 3 times with imaging media (DMEM, no phenol red, with 10% FBS and 20 mM HEPES) containing 10 μg/ml Hoechst nuclear stain prior to imaging. Fluorescence microscopy revealed a punctate staining pattern, suggesting that the Arc capsids were internalized by the HeLa cells (FIG. 8). Little or no intracellular staining was observed after administration of Alexa Fluor™ 594-labeled bovine serum albumin (BSA) (final concentration of 0.05 mg/ml) or 45.6 μM Alexa Fluor™ 594 under identical conditions.

Example 5—Heterologous RNA Delivery by Arc Capsids

Human Arc capsids were loaded with Cre RNA by spiking in excess RNA during capsid formation (by dialysis into 1 M sodium phosphate). Cre RNA-loaded capsids were administered to HeLa cells in biological triplicate at a final capsid concentration of 0.05 mg/ml for 4 hours at 37° C. The cells were then washed 3 times with ice-cold 1×PBS prior to RNA extraction (Invitrogen™ TRIzol™ Reagent #15596026). Purified cell-associated RNA was quantified by qPCR in technical triplicate, normalizing values to cellular GAPDH levels, and comparing to Escherichia coli rrsA rRNA and Arc RNA that could have carried over from protein purification. Table 2 shows the primers used for the qPCR reactions. The amount of cell-associated Cre RNA detected was >27-fold higher when Arc capsids were loaded with Cre RNA compared to control capsids not loaded with Cre RNA (FIG. 9).

TABLE 2 Primers for qPCR quantification of RNA delivered by Arc capsids to HeLa cells

Gene - Primer | Sequence | SEQ ID NO:
GAPDH-F | AAGCTCATTTCCTGGTATGACAACGA | 62
GAPDH-R | AGGGTCTCTCTCTTCCTCTTGTGCT | 63
rrsA-F | GCTCAACCTGGGAACTGCATCTGAT | 64
rrsA-R | TAATCCTGTTTGCTCCCCACGCTTT | 65
Arc CDS-F | GGCCCCTCAGCTCCAGTGATTC | 66
Arc CDS-R | CCTGTTGTCACTCTCCTGGCTCTGA | 67
Cre CDS-F | GCCAAGACATAAGAAACCTCGCCT | 68
Cre CDS-R | GTGAATCAACATCCTCCCTCCGTC | 69
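One common way to express qPCR data normalized to a reference gene such as GAPDH is the 2^(-ΔΔCt) calculation, sketched below in Python. This Example does not state which quantification model was used, so the method, the function name, and the Ct values are assumptions for illustration only; the calculation further assumes approximately 100% amplification efficiency for both primer pairs.

```python
# Illustrative 2^(-ddCt) calculation. The Ct values below are hypothetical
# and are not data from this Example.
from statistics import mean


def relative_fold_change(ct_target_test, ct_ref_test,
                         ct_target_ctrl, ct_ref_ctrl) -> float:
    """Fold change of a target RNA in test versus control samples.

    Each argument is a sequence of technical-replicate Ct values. The target
    is normalized to the reference gene within each sample (dCt), and the
    test sample is then compared to the control (ddCt).
    """
    d_ct_test = mean(ct_target_test) - mean(ct_ref_test)
    d_ct_ctrl = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    fold = relative_fold_change(
        ct_target_test=[20.0, 20.0, 20.0],  # Cre, cells given loaded capsids
        ct_ref_test=[18.0, 18.0, 18.0],     # GAPDH, same cells
        ct_target_ctrl=[25.0, 25.0, 25.0],  # Cre, cells given control capsids
        ct_ref_ctrl=[18.0, 18.0, 18.0],     # GAPDH, same cells
    )
    print(fold)  # 32.0 for these hypothetical Ct values
```

If primer efficiencies differ appreciably from 100%, an efficiency-corrected model or a standard curve would be a more appropriate choice than this simplified calculation.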

FIG. 10 illustrates an alternative method of demonstrating the delivery of a heterologous RNA by an Arc or endo-Gag capsid. 6×His-tagged Arc or endo-Gag genes are expressed in a host cell. The resulting Arc monomers are mixed with translatable Cre mRNA under capsid forming conditions to form Cre mRNA loaded capsids. Cre-loaded capsids are then administered to LoxP-luciferase reporter mice. Upon successful delivery of Cre mRNA into mouse cells and subsequent translation of Cre recombinase protein, LoxP sites of the reporter are recombined, leading to luciferase expression, which is optionally detected by bioluminescence imaging upon administration of luciferin. This method is used to test the transmission potential of candidate Arc and endo-Gag genes. A positive luciferase signal indicates that the candidate Arc or endo-Gag gene encodes an Arc or endo-Gag protein capable of assembling into capsids that incorporate a heterologous cargo and deliver that cargo to a target cell.

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

TABLE 3 Arc and endo-Gag amino acid and nucleotide sequencesSEQ ID NO: 1GELDHRTSGGLHAYPGPRGGQVAKPNVILQIGKCRAEMLEHVRRTHRHLLAEVSKQVERELKGLHRSVGKLESNLDGYVPTSDSQRWKKSIKACLCRCQETIANLERWVKREMHVWREVFYRLERWADRLESTGGKYPVGSESARHTVSVGVGGPESYCHEADGYDYTVSPYAITPPPAAGELPGQEPAEAQQYQPWVPGEDGQPSPGVDTQIFEDPREFLSHLEEYLRQVGGSEEYWLSQIQNHMNGPAKKWWEFKQGSVKNWVEFKKEFLQYSEGTLSREAIQRELDLPQKQGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLRHPLPKTLEQLIQRGMEVQDDLEQAAEPAGPHLPVEDEAETLTPAPNSESVASDRTQPE SEQ ID NO: 2GELDQRTTGGLHAYPAPRGGPVAKPNVILQIGKCRAEMLEHVRRTHRHLLTEVSKQVERELKGLHRSVGKLESNLDGYVPTGDSQRWRKSIKACLCRCQETIANLERWVKREMHVWREVFYRLERWADRLESMGGKYPVGSNPSRHTTSVGVGGPESYGHEADTYDYTVSPYAITPPPAAGELPGQEAVEAQQYPPWGLGEDGQPSPGVDTQIFEDPREFLSHLEEYLRQVGGSEEYWLSQIQNHMNGPAKKWWEYKQGSVKNWVEFKKEFLQYSEGALSREAVQRELDLPQKQGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLRPPLPKTLEQLIQKGMEVEDGLEQVAEPASPHLPTEEESEALTPALTSESVASDRTQPE SEQ ID NO: 3GELDHRTTGGLHAYPAPRGGPAAKPNVILQIGKCRAEMLEHVRRTHRHLLAEVSKQVERELKGLHRSVGKLESNLDGYVPTGDSQRWKKSIKACLSRCQETIANLERWVKREMHVWREVFYRLERWADRLESGGGKYPVGSDPARHTVSVGVGGPESYCQDADNYDYTVSPYAITPPPAAGQLPGQEEVEAQQYPPWAPGEDGQLSPGVDTQVFEDPREFLRHLEDYLRQVGGSEEYWLSQIQNHMNGPAKKWWEYKQGSVKNWVEFKKEFLQYSEGTLSREAIQRELDLPQKQGEPLDQFLWRKRDLYQTLYVDAEEEEIIQYVVGTLQPKLKRFLRPPLPKTLEQLIQKGMEVQDGLEQAAEPAAEEAEALTPALTNESVASDRTQPE SEQ ID NO: 4GELDRLNPSSGLHPSSGLHPYPGLRGGATAKPNVILQIGKCRAEMLEHVRKTHRHLLTEVSRQVERELKGLHKSVGKLESNLDGYVPSSDSQRWKKSIKACLSRCQETIAHLERWVKREMNVWREVFYRLERWADRLEAMGGKYPAGEQARRTVSVGVGGPETCCPGDESYDCPISPYAVPPSTGESPESLDQGDQHYQQWFALPEESPVSPGVDTQIFEDPREFLRHLEKYLKQVGGTEEDWLSQIQNHMNGPAKKWWEYKQGSVKNWLEFKKEFLQYSEGTLTRDALKRELDLPQKQGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLHHPLPKTLEQLIQRGQEVQNGLEPTDDPAGQRTQSEDNDESLTPAVTNESTASEGTLPE SEQ ID NO: 
5GQLDNVTNAGIHSFQGHRGVANKPNVILQIGKCRAEMLEHVRRTHRHLLSEVSKQVERELKGLQKSVGKLENNLEDHVPTDNQRWKKSIKACLARCQETIAHLERWVKREMNVWKEVFFRLEKWADRLESMGGKYCPGEHGKQTVSVGVGGPEIRPSEGEIYDYALDMSQMYALTPPPGEMPSIPQAHDSYQWVSVSEDAPASPVETQVFEDPREFLSHLEEYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWVEFKKEFLQYSEGTLTRDAIKRELDLPQKEGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLSYPLPKTLEQLIQRGKEVQGNMDHSDEPSPQRTPEIQSGDSVESMPPSTTASPVPSNGTQPEPPSPPATVI SEQ ID NO: 6GQLDNVTNAGIHSFQGHRGVANKPNVILQIGKCRAEMLEHVRRTHRHLLSEVSKQVERELKGLQKSVGKLENNLEDHVPTDNQRWKKSIKACLARCQETIAHLERWVKREMNVWKEVFFRLEKWADRLESMGGKYCPGEHGKQTVSVGVGGPEIRPSEGEIYDYALDMSQMYALTPPPGEVPSIPQAHDSYQWVSVSEDAPASPVETQVFEDPREFLSHLEEYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWVEFKKEFLQYSEGTLTRDAIKRELDLPQKEGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLSYPLPKTLEQLIQRGKEVQGNMDHSEEPSPQRTPEIQSGDSVDSVPPSTTASPVPSNGTQPE SEQ ID NO: 7GQLDNVTNAGIHSFQGHRGVANKPNVILQIGKCRAEMLEHVRRTHRHLLSEVSKQVERELKGLQKSVGKLENNLEDHVPTDNQRWKKSIKACLARCQETIAHLERWVKREMNVWKEVFFRLEKWADRLESMGGKYCPGDHGKQTVSVGVGGPEIRPSEGEIYDYALDMSQMYALTPPPGEVPSIPQAHDSYQWVSTSEDAPASPVETQVFEDPREFLSHLEEYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWVEFKKEFLQYSEGTLTRDAIKRELDLPQKEGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLSYPLPKTLEQLIQRGKEVQGNMDHSEEPSPQRTPEIQSGDSVDSVPPSTTASPVPSNGTQPE SEQ ID NO: 8GSWGLQRHVADERRGLATPTYGAVCSIREKKASQLSGQSCLEKELLGWKCTEAIVEMMQVDNFNHGNLHSCQGHRGMANHKPNVILQIGKCRAEMLDHVRRTHRHLLTEVSKQVERELKSLQKSVGKLENNLEDHVPSAAENQRWKKSIKACLARCQETIAHLERWVKREINVWKEVFFRLEKWADRLESGGGKYGPGDQSRQTVSVGVGAPEIQPRKEEIYDYALDMSQMYALTPPPMGEDPNVPQSHDSYQWITISDDSPPSPVETQIFEDPREFLTHLEDYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWLEFKKEFLQYSEGTLTRDAIKQELDLPQKDGEPLDQFLWRKRDLYQTLYIDAEEEEVIQYVVGTLQPKLKRFLSHPYPKTLEQLIQRGKEVEGNLDNSEEPSPQRSPKHQLGGSVESLPPSSTASPVASDETHPDVSAPPVTVI SEQ ID NO: 
9GDGETQAENPSTSLNNTDEDILEQLKKIVMDQQHLYQKELKASFEQLSRKMFSQMEQMNSKQTDLLLEHQKQTVKHVDKRVEYLRAQFDASLGWRLKEQHADITTKIIPEIIQTVKEDISLCLSTLCSIAEDIQTSRATIVTGHAAVQTHPVDLLGEHHLGTTGHPRLQSTRVGKPDDVPESPVSLFMQGEARSRIVGKSPIKLQFPTFGKANDSSDPLQYLERCEDFLALNPLTDEELMATLRNVLHGTSRDWWDVARHKIQTWREFNKHFRAAFLSEDYEDELAERVRNRIQKEDESIRDFAYMYQSLCKRWNPAICEGDVVKLILKNINPQLPSQLRSRVTTVDELVRLGQQLEKDRQNQLQYELRKSSGKIIQKSSSCETSALPNTKSTPNQQNPATSNRPPQVYCWRCKGHHAPASCPQWKADKHRAQPSRSSGPQTLTNLQAQDI SEQ ID NO: 10GELDQRAAGGLRAYPAPRGGPVAKPSVILQIGKCRAEMLEHVRRTHRHLLTEVSKQVERELKGLHRSVGKLEGNLDGYVPTGDSQRWKKSIKACLCRCQETIANLERWVKREMHVWREVFYRLERWADRLESMGGKYPVGTNPSRHTVSVGVGGPEGYSHEADTYDYTVSPYAITPPPAAGELPGQEAVEAQQYPPWGLGEDGQPGPGVDTQIFEDPREFLSHLEEYLRQVGGSEEYWLSQIQNHMNGPAKKWWEFKQGSVKNWVEFKKEFLQYSEGTLSREAIQRELDLPQKQGEPLDQFLWRKRDLYQTLYVDAEEEEIIQYVVGTLQPKLKRFLRPPLPKTLEQLIQKGMEVQDGLEQAAEPASPRLPPEEESEALTPALTSESVASDRTQPE SEQ ID NO: 11GQLDNVTNAGIHSFQGHRGVANKPNVILQIGKCRAEMLEHVRRTHRHLLSEVSKQVERELKGLQKSVGKLENNLEDHVPTDNQRWKKSIKACLARCQETIAHLERWVKREMNVWKEVFFRLEKWADRLESMGGKYCPGEHGKQTVSVGVGGPEIRPSEGEIYDYALDMSQMYALTPGPGEVPSIPQAHDSYQWVSVSEDAPASPVETQIFEDPHEFLSHLEEYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWVEFKKEFLQYSEGTLTRDAIKRELDLPQKEGEPLDQFLWRKRDLYQTLYVDADEEEIIQYVVGTLQPKLKRFLSYPLPKTLEQLIQRGKEVQGNMDHSEEPSPQRTPEIQSGDSVESMPPSTTASPVPSNGTQPEPPSPPATVI SEQ ID NO: 12GQLENINQGSLHAFQGHRGVVHNNKPNVILQIGKCRAEMLEHVRRTHRHLLTEVSKQVERELKGLQKSVGKLENNLEDHVPSAAENQRWKKSIKACLARCQETIANLERWVKREMNVWKEVFFRLERWADRLESGGGKYCHADQGRQTVSVGVGGPEVRPSEGEIYDYALDMSQMYALTPPPMGDVPVIPQPHDSYQWVTDPEEAPPSPVETQIFEDPREFLTHLEDYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWLEFKKEFLQYSEGTLTRDAIKQELDLPQKEGEPLDQFLWRKRDLYQTLYVEAEEEEVIQYVVGTLQPKLKRFLSHPYPKTLEQLIQRGKEVEGNLDNSEEPSPQRTPEHQLGDSVESLPPSTTASPAGSDKTQPEISLPPTTVI SEQ ID NO: 
13GQLDSVTNAGVHTYQGHRSVANKPNVILQIGKCRTEMLEHVRRTHRHLLTEVSKQVERELKGLQKSVGKLENNLEDHVPTDNQRWKKSIKACLARCQETIAHLERWVKREMNVWKEVFFRLERWADRLESMGGKYCPTDSARQTVSVGVGGPEIRPSEGEIYDYALDMSQMYALTPSPGELPSVPQPHDSYQWVTSPEDAPASPVETQVFEDPREFLCHLEEYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDTVKNWVEFKKEFLQYSEGTLTRDAIKRELDLPQKDGEPLDQFLWRKRDLYQTLYIDADEEQIIQYVVGTLQPKLKRFLSYPLPKTLEQLIQKGKEVQGSLDHSEEPSPQRASEARTGDSVETLPPSTTTSPNTSSGTQPEAPSPPATVI SEQ ID NO: 14GQLDSVTNAGVHTYQGHRGVANKPNVILQIGKCRTEMLEHVRRTHRHLLTEVSKQVERELKGLQKSVGKLENNLEDHVPTDNQRWKKSIKACLARCQETIAHLERWVKREMNVWKEVFFRLERWADRLESMGGKYCPTDSARQTVSVGVGGPEIRPSEGEIYDYALDMSQMYALTPSPGELPSIPQPHDSYQWVTSPEDAPASPVETQVFEDPREFLCHLEEYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDTVKNWVEFKKEFLQYSEGTLTRDAIKRELDLPQKDGEPLDQFLWRKRDLYQTLYIDADEEQIIQYVVGTLQPKLKRFLSYPLPKTLEQLIQKGKEVQGSLDHSEEPSPQRASEARTGDSVESLPPSTTTSPNASSGTQPEAPSPPATVI SEQ ID NO: 15GQLENVNHGNLHSFQGHRGGVANKPNVILQIGKCRAEMLDHVRRTHRHLLTEVSKQVERELKGLQKSVGKLENNLEDHVPSAVENQRWKKSIKACLSRCQETIAHLERWVKREMNVWKEVFFRLERWADRLESGGGKYCHGDNHRQTVSVGVGGPEVRPSEGEIYDYALDMSQMYALTPPSPGDVPVVSQPHDSYQWVTVPEDTPPSPVETQIFEDPREFLTHLEDYLKQVGGTEEYWLSQIQNHMNGPAKKWWEYKQDSVKNWLEFKKEFLQYSEGTLTRDAIKEELDLPQKDGEPLDQFLWRKRDLYQTLYVEADEEEVIQYVVGTLQPKLKRFLSHPYPKTLEQLIQRGKEVEGNLDNSEEPTPQRTPEHQLCGSVESLPPSSTVSPVASDGTQPETSPLPATVI SEQ ID NO: 16GPLTLLQDWCRGEHLNTRRCMLILGIPEDCGEDEFEETLQEACRHLGRYRVIGRMFRREENAQAILLELAQDIDYALLPREIPGKGGPWEVIVKPRNSDGEFLNRLNRFLEEERRTVSDMNRVLGSDTNCSAPRVTISPEFWTWAQTLGAAVQPLLEQMLYRELRVFSGNTISIPGALAFDAWLEHTTEMLQMWQVPEGEKRRRLMECLRGPALQVVSGLRASNASITVEECLAALQQVFGPVESHKIAQVKLCKAYQEAGEKVSSFVLRLEPLLQRAVENNVVSRRNVNQTRLKRVLSGATLPDKLRDKLKLMKQRRKPPGFLALVKLLREEEEWEATLGPDRESLEGLEVAPRPPARITGVGAVPLPASGNSFDARPSQGYRRRRGRGQHRRGGVARAGSRGSRKRKRHTFCYSCGEDGHIRVQCINPSNLLLAKETKEILEGGEREAQTNSR SEQ ID NO: 
17GALTLLEDWCKGMDMDPRKALLIVGIPMECSEVEIQDTVKAGLQPLCAYRVLGRMFRREDNAKAVFIELADTVNYTTLPSHIPGKGGSWEVVVKPRNPDDEFLSRLNYFLKDEGRSMTDVARALGCCSLPAESLDAEVMPQVRSPPLEPPKESMWYRKLKVFSGTASPSPGEETFEDWLEQVTEIMPIWQVSEVEKRRRLLESLRGPALSIMRVLQANNDSITVEQCLDALKQIFGDKEDFRASQFRFLQTSPKIGEKVSTFLLRLEPLLQKAVHKSPLSVRSTDMIRLKHLLARVAMTPALRGKLELLDQRGCPPNFLELMKLIRDEEEWENTEAVMKNKEKPSGRGRGASGRQARAEASVSAPQATVQARSFSDSSPQTIQGGLPPLVKRRRLLGSESTRGEDHGQATYPKAENQTPGREGPQAAGEELGNEAGAGAMSHPKPWET SEQ ID NO: 18GAVTMLQDWCRWMGVNARRGLLILGIPEDCDDAEFQESLEAALRPMGHFTVLGKAFREEDNATAALVELDREVNYALVPREIPGTGGPWNVVFVPRCSGEEFLGLGRVFHFPEQEGQMVESVAGALGVGLRRVCWLRSIGQAVQPWVEAVRCQSLGVFSGRDQPAPGEESFEVWLDHTTEMLHVWQGVSERERRRRLLEGLRGTALQLVHALLAENPARTAQDCLAALAQVFGDNESQATIRVKCLTAQQQSGERLSAFVLRLEVLLQKAMEKEALARASADRVRLRQMLTRAHLTEPLDEALRKLRMAGRSPSFLEMLGLVRESEAWEASLARSVRAQTQEGAGARAGAQAVARASTKVEAVPGGPGREPEGLLQAGGQEAEELLQEGLKPVLEECDN SEQ ID NO: 19GAVTMLQDWCRWMGVNARRGLLILGIPEDCDDAEFQESLEAALRPMGHFTVLGKVFREEDNATAALVELDREVNYALVPREIPGTGGPWNVVFVPRCSGEEFLGLGRVFHFPEQEGQMVESVAGALGVGLRRVCWLRSIGQAVQPWVEAVRYQSLGVFSGRDQPAPGEESFEVWLDHTTEMLHVWQGVSERERRRRLLEGLRGTALQLVHALLAENPARTAQDCLAALAQVFGDNESQATIRVKCLTAQQQSGERLSAFVLRLEVLLQKAMEKEALARASADRVRLRQMLTRAHLTEPLDEALRKLRMAGRSPSFLEMLGLVRESEAWEASLARSVRAQTQEGAGARAGAQAVARASTKVEAVPGGPGREPEGLRQAGGQEAEELLQEGLKPVLEECDN SEQ ID NO: 20GVEDLAASYIVLKLENEIRQAQVQWLMEENAALQAQIPELQKSQAAKEYDLLRKSSEAKEPQKLPEHMNPPAAWEAQKTPEFKEPQKPPEPQDLLPWEPPAAWELQEAPAAPESLAPPATRESQKPPMAHEIPTVLEGQGPANTQDATIAQEPKNSEPQDPPNIEKPQEAPEYQETAAQLEFLELPPPQEPLEPSNAQEFLELSAAQESLEGLIVVETSAASEFPQAPIGLEATDFPLQYTLTFSGDSQKLPEFLVQLYSYMRVRGHLYPTEAALVSFVGNCFSGRAGWWFQLLLDIQSPLLEQCESFIPVLQDTFDNPENMKDANQCIHQLCQGEGHVATHFHLIAQELNWDESTLWIQFQEGLASSIQDELSHTSPATNLSDLITQCISLEEKPDPNPLGKSSSAEGDGPESPPAENQPMQAAINCPHISEAEWVRWHKGRLCLYCGYPGHFARDCPVKPHQALQAGNIQACQ SEQ ID NO: 
21GVQPQTSKAESPALAASPNAQMDDVIDTLTSLRLTNSALRREASTLRAEKANLTNMLESVMAELTLLRTRARIPGALQITPPISSITSNGTRPMTTPPTSLPEPFSGDPGRLAGFLMQMDRFMIFQASRFPGEAERVAFLVSRLTGEAEKWAIPHMQPDSPLRNNYQGFLAELRRTYKSPLRHARRAQIRKTSASNRAVRERQMLCRQLASAGTGPCPVHPASNGTSPAPALPARARNL SEQ ID NO: 22GDGRVQLMKALLAGPLRPAARRWRNPIPFPETFDGDTDRLPEFIVQTSSYMFVDENTFSNDALKVTFLITRLTGPALQWVIPYIRKESPLLNDYRGFLAEMKRVFGWEEDEDF SEQ ID NO: 23GEGRVQLMKALLARPLRPAARRWRNPIPFPETFDGDTDRLPEFIVQTSSYMFVDENTFSNDALKVTFLITRLTGPALQWVIPYIKKESPLLSDYRGFLAEMKRVFGWEEDEDF SEQ ID NO: 24GPRGRCRQQGPRIPIWAAANYANAHPWQQMDKASPGVAYTPLVDPWIERPCCGDTVCVRTTMEQKSTASGTCGGKPAERGPLAGHMPSSRPHRVDFCWVPGSDPGTFDGSPWLLDRFLAQLGDYMSFHFEHYQDNISRVCEILRRLTGRAQAWAAPYLDGDLPLPDDYELFCQDLKEVVQDPNSFAEYHAVVICPLPLASSQLPVAPQLPVVRQYLARFLEGLALDMGTAPRSLPAAMATPAVSGSNSVSRSALFEQQLTKESTPGPKEPPVLPSSTCSSKPGPVEPASSQPEEAAPTPVPRLSESANPPAQRPDPAHPGGPKPQKTEEEVLETEGDQEVSLGTPQEVVEAPETPGEPPLSPGF SEQ ID NO: 25GVDELVLLLHALLMRHRALSIENSQLMEQLRLLVCERASLLRQVRPPSCPVPFPETFNGESSRLPEFIVQTASYMLVNENRFCNDAMKVAFLISLLTGEAEEWVVPYIEMDSPILGDYRAFLDEMKQCFGWDDDEDDDDEEEEDDY SEQ ID NO: 26GPVDLGQALGLLPSLAKAEDSQFSESDAALQEELSSPETARQLFRQFRYQVMSGPHETLKQLRKLCFQWLQPEVHTKEQILEILMLEQFLTILPGEIQMWVRKQCPGSGEEAVTLVESLKGDPQRLWQWISIQVLGQDILSEKMESPSCQVGEVEPHLEVVPQELGLENSSSGPGELLSHIVKEESDTEAELALAASQPARLEERLIRDQDLGASLLPAAPQEQWRQLDSTQKEQYWDLMLETYGKMVSGAGISHPKSDLTNSIEFGEELAGIYLHVNEKIPRPTCIGDRQENDKENLNLENHRDQELLHASCQASGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQLSPQERISEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPMAQKLPTCRECGKTFYRNSQLIFHQRTHIGETYFQCTICKKAFLRSSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGLRHHEKIHTGEKPYKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSLDKHQRSHLGKKPFQ SEQ ID NO: 27GTLRLLEDWCRGMDMNPRKALLIAGISQSCSVAEIEEALQAGLAPLGEYRLLGRMFRRDENRKVALVGLTAETSHALVPKEIPGKGGIWRVIFKPPDPDNTFLSRLNEFLAGEGMTVGELSRALGHENGSLDPEQGMIPEMWAPMLAQALEALQPALQCLKYKKLRVFSGRESPEPGEEEFGRWMFHTTQMIKAWQVPDVEKRRRLLESLRGPALDVIRVLKINNPLITVDECLQALEEVFGVTDNPRELQVKYLTTYHKDEEKLSAYVLRLEPLLQKLVQRGAIERDAVNQARLDQVIAGAVHKTIRRELNLPEDGPAPGFLQLLVLIKDYEAAEEEEALLQAILEGNFT SEQ ID NO: 
28GTERRRDELSEEINNLREKVMKQSEENNNLQSQVQKLTEENTTLREQVEPTPEDEDDDIELRGAAAAAAPPPPIEEECPEDLPEKFDGNPDMLAPFMAQCQIFMEKSTRDFSVDRVRVCFVTSMMTGRAARWASAKLERSHYLMHNYPAFMMEMKHVFEDPQRREVAKRKIRRLRQGMGSVIDYSNAFQMIAQDLDWNEPALIDQYHEGLSDHIQEELSHLEVAKSLSALIGQCIHIERRLARAAAARKPRSPPRALVLPHIASHHQVDPTEPVGGARMRLTQEEKERRRKLNLCLYCGTGGHYADNCPAKASKSSPAGKLPGPAVEGPSATGPEIIRSPQDDASSPHLQVMLQIHLPGRHTLFVRAMIDSGASGNFIDHEYVAQNGIPLRIKDWPILVEAIDGRPIASGPVVHETHDLIVDLGDHREVLSFDVTQSPFFPVVLGVRWLSTHDPNITWSTRSIVFDSEYCRYHCRMYSPIPPSLPPPAPQPPLYYPVDGYRVYQPVRYYYVQNVYTPVDEHVYPDHRLVDPHIEMIPGAHSIPSGHVYSLSEPEMAALRDFVARNVKDGLITPTIAPNGAQVLQVKRGWKLQVSYDCRAPNNFTIQNQYPRLSIPNLEDQAHLATYTEFVPQIPGYQTYPTYAAYPTYPVGFAWYPVGRDGQGRSLYVPVMITWNPHWYRQPPVPQYPPPQPPPPPPPPPPPPSYSTL SEQ ID NO: 29GGGGAGCTGGACCACCGGACCAGCGGCGGGCTCCACGCCTACCCCGGGCCGCGGGGCGGGCAGGTGGCCAAGCCCAACGTGATCCTGCAGATCGGGAAGTGCCGGGCCGAGATGCTGGAGCACGTGCGGCGGACGCACCGGCACCTGCTGGCCGAGGTGTCCAAGCAGGTGGAGCGCGAGCTGAAGGGGCTGCACCGGTCGGTCGGGAAGCTGGAGAGCAACCTGGACGGCTACGTGCCCACGAGCGACTCGCAGCGCTGGAAGAAGTCCATCAAGGCCTGCCTGTGCCGCTGCCAGGAGACCATCGCCAACCTGGAGCGCTGGGTCAAGCGCGAGATGCACGTGTGGCGCGAGGTGTTCTACCGCCTGGAGCGCTGGGCCGACCGCCTGGAGTCCACGGGCGGCAAGTACCCGGTGGGCAGCGAGTCAGCCCGCCACACCGTTTCCGTGGGCGTGGGGGGTCCCGAGAGCTACTGCCACGAGGCAGACGGCTACGACTACACCGTCAGCCCCTACGCCATCACCCCGCCCCCAGCCGCTGGCGAGCTGCCCGGGCAGGAGCCCGCCGAGGCCCAGCAGTACCAGCCGTGGGTCCCCGGCGAGGACGGGCAGCCCAGCCCCGGCGTGGACACGCAGATCTTCGAGGACCCTCGAGAGTTCCTGAGCCACCTAGAGGAGTACTTGCGGCAGGTGGGCGGCTCTGAGGAGTACTGGCTGTCCCAGATCCAGAATCACATGAACGGGCCGGCCAAGAAGTGGTGGGAGTTCAAGCAGGGCTCCGTGAAGAACTGGGTGGAGTTCAAGAAGGAGTTCCTGCAGTACAGCGAGGGCACGCTGTCCCGAGAGGCCATCCAGCGCGAGCTGGACCTGCCGCAGAAGCAGGGCGAGCCGCTGGACCAGTTCCTGTGGCGCAAGCGGGACCTGTACCAGACGCTCTACGTGGACGCGGACGAGGAGGAGATCATCCAGTACGTGGTGGGCACCCTGCAGCCCAAGCTCAAGCGTTTCCTGCGCCACCCCCTGCCCAAGACCCTGGAGCAGCTCATCCAGAGGGGCATGGAGGTGCAGGATGACCTGGAGCAGGCGGCCGAGCCGGCCGGCCCCCACCTCCCGGTGGAGGATGAGGCGGAGACCCTCACGCCCGCCCCCAACAGCGAGTCCGTGGCCAGTGACCGGACCCAGCCCGAG SEQ ID NO: 
30GGGGAATTGGATCAACGTACTACCGGTGGCCTTCACGCATACCCTGCACCACGCGGGGGCCCTGTCGCGAAGCCAAATGTCATCCTGCAGATTGGGAAGTGCCGGGCTGAGATGCTGGAGCACGTCCGTCGGACGCATCGTCATCTTCTTACTGAGGTGTCAAAACAGGTGGAGCGTGAACTCAAAGGCTTGCACCGCAGCGTTGGGAAACTTGAAAGCAACTTAGATGGCTATGTGCCGACTGGCGACAGCCAGCGTTGGCGTAAGTCCATCAAAGCATGTTTGTGTCGTTGCCAGGAAACGATTGCAAACCTGGAGCGTTGGGTCAAACGGGAGATGCATGTCTGGCGTGAAGTATTTTATCGTTTAGAGCGTTGGGCCGATCGTTTAGAGAGCATGGGTGGTAAGTACCCTGTGGGGAGCAACCCTTCTCGGCATACGACGTCAGTCGGTGTTGGCGGGCCGGAGTCCTACGGTCATGAAGCGGACACCTACGACTATACCGTAAGCCCTTATGCTATTACCCCACCACCTGCGGCCGGCGAATTACCTGGCCAGGAAGCCGTTGAGGCTCAACAATACCCTCCTTGGGGGCTGGGCGAGGATGGTCAACCTAGCCCAGGGGTAGACACGCAAATCTTTGAGGACCCACGGGAGTTTCTTTCCCACCTGGAAGAATACCTGCGTCAGGTTGGTGGGAGCGAAGAATACTGGCTGTCACAAATTCAAAACCATATGAATGGTCCTGCAAAAAAATGGTGGGAATATAAACAGGGTTCCGTGAAAAACTGGGTTGAGTTTAAAAAGGAGTTTCTTCAATATTCCGAGGGCGCCCTCAGTCGGGAGGCGGTCCAACGCGAGTTGGACTTGCCACAGAAACAGGGGGAACCACTCGATCAATTCCTTTGGCGGAAACGTGACCTTTACCAGACATTGTACGTGGATGCAGATGAGGAAGAAATTATCCAATATGTTGTGGGGACCCTGCAGCCGAAACTGAAACGTTTCCTTCGCCCGCCGCTGCCTAAAACGTTGGAACAACTTATTCAGAAAGGTATGGAGGTCGAGGATGGCTTAGAACAAGTCGCAGAGCCGGCCTCGCCACACTTGCCTACAGAGGAGGAATCGGAGGCGCTGACCCCAGCACTTACATCAGAGTCAGTGGCATCAGACCGGACACAACCAGAG SEQ ID NO: 
31GGGGAGTTAGATCACCGTACAACGGGGGGGTTGCACGCATACCCTGCTCCACGTGGCGGGCCGGCAGCTAAGCCAAACGTAATCCTGCAGATTGGGAAGTGCCGGGCAGAGATGTTGGAGCACGTCCGGCGGACCCACCGGCACCTCCTGGCTGAAGTGTCTAAACAAGTAGAACGGGAACTCAAAGGTCTTCATCGTAGCGTCGGGAAATTGGAATCGAATTTGGACGGGTATGTTCCTACAGGCGACTCACAGCGGTGGAAAAAGAGCATCAAGGCCTGCCTGAGTCGCTGCCAGGAGACGATTGCTAACCTCGAACGCTGGGTTAAGCGGGAGATGCACGTTTGGCGCGAAGTCTTCTACCGGCTGGAGCGTTGGGCTGATCGGCTCGAATCTGGTGGGGGTAAGTATCCAGTTGGGTCCGACCCTGCTCGCCACACAGTCTCAGTTGGCGTAGGTGGGCCGGAGTCGTATTGCCAAGATGCGGACAACTATGATTATACAGTTTCCCCATACGCGATCACACCACCGCCGGCAGCAGGGCAGCTGCCAGGTCAGGAAGAGGTTGAGGCCCAGCAGTATCCACCATGGGCCCCAGGGGAAGACGGCCAGCTTTCTCCTGGGGTGGACACTCAAGTTTTTGAAGATCCGCGTGAATTTCTGCGGCATTTAGAAGATTATCTCCGCCAGGTCGGGGGGTCTGAAGAGTATTGGTTAAGCCAAATTCAAAACCATATGAACGGCCCGGCCAAGAAGTGGTGGGAGTACAAGCAAGGGTCTGTGAAAAATTGGGTGGAGTTTAAGAAAGAATTCTTGCAATATTCTGAGGGCACTCTTTCGCGTGAAGCCATCCAACGCGAACTCGACTTACCGCAGAAACAAGGGGAACCTCTCGACCAATTTCTGTGGCGCAAACGCGACCTGTACCAGACTCTTTACGTCGATGCTGAGGAGGAAGAAATTATTCAATACGTAGTTGGCACACTGCAGCCTAAGCTTAAACGGTTTTTACGTCCACCATTGCCGAAGACGCTTGAACAACTCATCCAGAAGGGTATGGAGGTTCAAGATGGTCTGGAACAGGCAGCGGAACCAGCGGCGGAGGAGGCAGAAGCCCTGACACCTGCGTTAACTAACGAGTCTGTCGCGAGCGACCGCACCCAGCCGGAA SEQ ID NO: 
32GGGGAATTAGACCGCCTGAACCCAAGCTCAGGCCTGCATCCATCCTCTGGTTTGCATCCATACCCAGGTCTCCGGGGCGGGGCAACCGCGAAGCCTAATGTCATTTTGCAAATTGGCAAATGCCGTGCGGAAATGCTTGAACACGTCCGCAAAACTCACCGTCATCTCCTCACAGAAGTATCGCGCCAAGTAGAACGCGAGCTCAAAGGCCTTCACAAAAGTGTTGGCAAGTTGGAATCAAATCTTGATGGGTACGTACCGTCAAGCGACTCCCAACGCTGGAAGAAAAGCATTAAGGCGTGCTTATCCCGTTGCCAAGAGACGATTGCGCATTTAGAACGCTGGGTTAAACGTGAAATGAATGTATGGCGTGAGGTGTTCTACCGTTTGGAACGTTGGGCGGACCGTCTGGAGGCTATGGGCGGTAAGTATCCTGCCGGTGAGCAGGCCCGGCGTACAGTTTCAGTGGGCGTTGGGGGCCCTGAGACATGTTGTCCAGGGGATGAAAGTTATGATTGTCCGATTTCTCCGTATGCAGTTCCACCTTCCACCGGCGAGTCTCCGGAATCCTTAGACCAAGGGGATCAGCACTATCAGCAGTGGTTTGCCCTCCCGGAGGAGTCCCCTGTTAGCCCTGGGGTTGATACCCAGATCTTTGAAGATCCTCGCGAGTTTTTACGTCATCTGGAGAAGTACCTGAAACAAGTCGGCGGGACAGAGGAAGACTGGCTTTCTCAAATCCAGAATCACATGAATGGGCCGGCGAAGAAGTGGTGGGAGTACAAGCAAGGGAGTGTTAAGAATTGGCTTGAATTTAAGAAGGAATTTTTACAGTATTCGGAGGGCACACTGACGCGGGACGCGTTGAAACGTGAACTGGATCTCCCACAGAAACAAGGCGAACCACTTGATCAATTTTTATGGCGGAAGCGCGACTTATATCAGACACTCTACGTTGACGCCGATGAAGAGGAAATCATTCAGTACGTCGTGGGCACTCTTCAGCCGAAATTAAAACGCTTTCTCCATCACCCACTCCCTAAGACGCTTGAGCAGCTTATCCAACGGGGCCAAGAAGTTCAGAATGGTCTGGAGCCTACCGACGATCCTGCAGGCCAACGCACTCAATCGGAGGACAACGACGAAAGCCTTACCCCTGCCGTCACCAATGAGAGTACTGCAAGCGAGGGCACCCTGCCAGAG SEQ ID NO: 
33GGGCAGCTTGATAACGTTACAAACGCGGGCATCCACTCCTTCCAGGGGCATCGTGGCGTAGCGAATAAGCCAAATGTCATTCTGCAAATTGGTAAATGTCGTGCGGAAATGCTGGAGCACGTTCGCCGCACCCACCGCCATTTATTATCTGAAGTATCTAAGCAGGTAGAACGTGAGCTGAAAGGGCTGCAAAAGTCCGTGGGCAAGCTCGAGAATAACTTGGAGGATCATGTCCCTACAGATAACCAACGCTGGAAGAAGTCCATTAAAGCGTGCTTGGCTCGTTGTCAAGAGACTATCGCGCATTTAGAGCGTTGGGTGAAACGCGAAATGAACGTCTGGAAGGAGGTGTTTTTCCGGCTGGAAAAGTGGGCAGACCGGCTGGAGTCAATGGGTGGCAAGTACTGCCCGGGCGAACACGGGAAACAAACCGTCAGTGTAGGCGTGGGGGGTCCTGAAATCCGGCCTTCGGAGGGGGAAATTTATGATTATGCTCTGGATATGAGCCAGATGTATGCACTCACCCCACCTCCAGGCGAAATGCCATCAATCCCACAAGCCCATGACAGCTATCAGTGGGTTAGTGTCTCAGAAGATGCCCCGGCGAGCCCTGTCGAAACCCAGGTATTTGAGGACCCTCGGGAATTCCTGTCTCACCTGGAGGAATACCTGAAGCAGGTAGGCGGCACGGAGGAGTATTGGTTGTCCCAGATCCAGAATCACATGAATGGTCCGGCAAAAAAATGGTGGGAATATAAACAGGACTCCGTTAAAAACTGGGTTGAGTTTAAAAAGGAATTCTTGCAATACTCTGAAGGTACTTTAACTCGGGATGCTATTAAGCGTGAACTCGACTTGCCGCAAAAGGAAGGTGAACCTCTTGACCAATTCCTTTGGCGGAAGCGGGACCTCTATCAGACACTTTACGTGGACGCGGATGAGGAGGAGATCATTCAGTATGTGGTCGGTACCCTGCAGCCGAAGCTCAAGCGTTTCCTGAGCTATCCTCTCCCAAAGACTTTAGAACAGCTCATCCAGCGCGGTAAAGAAGTGCAGGGTAACATGGATCACTCCGATGAGCCTTCGCCGCAGCGTACACCTGAAATTCAATCAGGTGACTCCGTAGAATCTATGCCACCTTCAACAACGGCATCTCCGGTTCCATCTAATGGTACCCAACCTGAGCCGCCGAGCCCGCCAGCCACCGTTATC SEQ ID NO: 
34GGGCAACTTGACAACGTAACAAACGCTGGGATTCACTCCTTTCAGGGCCACCGCGGTGTCGCCAACAAGCCAAACGTAATCTTGCAAATTGGCAAATGCCGTGCGGAGATGTTGGAACACGTTCGTCGTACACATCGTCACTTGCTGTCGGAAGTCTCTAAACAAGTAGAACGTGAACTTAAAGGGCTTCAAAAGTCAGTCGGCAAATTGGAAAACAACCTTGAAGACCATGTACCAACCGACAATCAGCGTTGGAAAAAGTCTATCAAAGCTTGCCTGGCCCGTTGTCAAGAGACGATTGCTCACCTGGAGCGGTGGGTAAAGCGCGAGATGAATGTGTGGAAAGAGGTCTTCTTCCGCTTGGAAAAATGGGCCGACCGTTTGGAGTCCATGGGCGGTAAATATTGTCCGGGTGAACATGGTAAGCAAACAGTCTCTGTGGGCGTTGGTGGGCCGGAGATTCGGCCTTCTGAAGGCGAGATTTACGATTATGCGCTCGACATGTCCCAGATGTATGCGCTTACACCACCACCGGGCGAGGTACCAAGCATTCCTCAAGCGCATGACAGTTATCAGTGGGTTAGCGTATCCGAAGACGCTCCTGCCTCGCCGGTAGAGACCCAGGTTTTTGAAGATCCTCGTGAATTTTTAAGCCACTTGGAGGAGTATTTGAAGCAGGTAGGGGGGACAGAGGAATATTGGCTGTCTCAGATCCAGAACCACATGAATGGCCCGGCTAAAAAGTGGTGGGAATACAAACAAGATTCGGTAAAGAATTGGGTAGAATTTAAAAAGGAGTTTTTACAGTACTCAGAGGGGACTCTCACGCGTGATGCGATCAAACGCGAGTTGGATCTTCCTCAAAAAGAGGGGGAGCCACTCGATCAGTTCCTCTGGCGCAAGCGGGATCTCTACCAAACACTCTACGTAGACGCAGACGAAGAAGAGATCATCCAGTACGTGGTGGGTACGCTCCAGCCGAAACTCAAACGTTTCCTCAGCTACCCACTTCCTAAGACTCTGGAACAACTGATTCAGCGGGGCAAAGAGGTCCAGGGTAACATGGACCATTCAGAGGAACCTAGTCCGCAACGTACACCTGAGATCCAATCTGGGGATTCTGTCGATTCGGTTCCACCTTCTACAACAGCGTCTCCGGTGCCGTCAAATGGGACCCAACCAGAG SEQ ID NO: 
35GGGCAGCTTGATAATGTAACCAATGCAGGTATCCACTCTTTCCAGGGTCACCGCGGTGTGGCAAACAAGCCAAATGTTATTCTGCAAATTGGTAAGTGTCGCGCTGAGATGTTAGAACACGTCCGGCGCACGCATCGGCATCTCCTGTCAGAGGTTTCAAAGCAGGTAGAGCGTGAATTAAAGGGCCTCCAGAAGTCCGTAGGTAAACTCGAAAATAATCTTGAAGACCACGTTCCTACCGATAATCAACGGTGGAAAAAGTCAATCAAGGCGTGCTTAGCACGGTGTCAGGAAACGATCGCGCACCTCGAACGTTGGGTGAAGCGCGAAATGAATGTCTGGAAAGAAGTGTTCTTCCGGCTTGAGAAGTGGGCTGATCGGCTCGAATCCATGGGTGGCAAATATTGTCCAGGTGATCATGGCAAGCAAACGGTCTCCGTCGGTGTTGGTGGTCCGGAAATCCGGCCGAGCGAGGGTGAAATCTATGACTACGCTCTTGATATGTCCCAGATGTATGCACTCACTCCTCCGCCGGGTGAGGTCCCGTCGATCCCGCAGGCGCATGACTCATACCAATGGGTGTCGACTAGCGAAGACGCACCAGCCTCCCCTGTTGAAACTCAAGTATTCGAGGACCCGCGTGAGTTCCTGAGCCATTTAGAGGAGTACCTTAAGCAGGTTGGTGGTACCGAGGAATACTGGTTGAGCCAGATTCAGAATCACATGAACGGGCCGGCTAAGAAATGGTGGGAATACAAGCAGGATTCAGTCAAGAATTGGGTCGAATTTAAGAAGGAGTTTTTGCAGTACAGTGAGGGGACGCTCACACGCGACGCTATCAAACGGGAGCTGGACCTGCCACAAAAGGAGGGTGAACCGCTTGATCAGTTTCTTTGGCGCAAGCGTGATCTGTATCAAACCCTGTATGTGGACGCTGACGAAGAAGAGATCATTCAGTACGTGGTTGGGACTCTGCAACCAAAGCTGAAGCGTTTTCTTTCTTATCCTCTCCCTAAGACACTGGAACAGTTAATCCAACGTGGCAAGGAGGTCCAGGGTAATATGGACCACTCTGAGGAACCGAGCCCGCAACGTACTCCTGAAATTCAGAGCGGGGATAGTGTCGACTCAGTTCCTCCAAGTACGACCGCATCCCCGGTCCCAAGTAACGGTACCCAACCAGAG SEQ ID NO: 
36GGGTCTTGGGGCTTGCAACGTCACGTGGCTGATGAACGTCGTGGCCTCGCTACGCCTACCTACGGCGCGGTTTGTTCCATTCGGGAGAAAAAAGCCTCCCAACTGAGCGGCCAGAGCTGTTTGGAGAAAGAGTTGCTTGGTTGGAAATGTACGGAGGCAATCGTGGAAATGATGCAAGTCGATAACTTTAACCACGGTAACTTACATAGCTGCCAAGGCCATCGGGGGATGGCAAATCACAAACCGAACGTAATCCTTCAAATCGGGAAATGTCGCGCAGAAATGTTAGACCACGTGCGTCGCACCCACCGCCATCTCTTGACGGAGGTTTCGAAGCAGGTAGAACGCGAATTGAAGTCTCTCCAAAAGTCGGTTGGCAAGCTCGAGAATAATCTGGAAGACCACGTGCCATCGGCAGCGGAGAACCAACGTTGGAAGAAATCAATTAAAGCCTGCCTGGCCCGGTGCCAAGAAACAATTGCTCACCTCGAACGCTGGGTTAAACGCGAAATCAACGTCTGGAAAGAAGTATTCTTTCGTCTGGAGAAGTGGGCGGACCGCCTTGAGTCGGGTGGGGGCAAGTATGGGCCTGGTGACCAAAGTCGTCAAACTGTAAGTGTCGGTGTTGGGGCCCCAGAAATCCAACCGCGGAAAGAAGAAATCTATGACTACGCTCTCGACATGTCGCAGATGTATGCCTTAACACCACCGCCGATGGGTGAAGACCCAAACGTACCTCAATCCCACGATAGCTACCAGTGGATTACCATCTCAGACGATTCACCTCCGTCGCCAGTGGAAACTCAAATTTTCGAGGATCCACGCGAATTCCTTACCCATCTCGAGGATTATCTTAAGCAAGTGGGCGGGACTGAAGAATATTGGTTGAGTCAGATTCAAAATCATATGAACGGTCCGGCCAAGAAATGGTGGGAGTACAAACAAGATTCCGTGAAAAACTGGTTGGAATTCAAGAAGGAATTCCTTCAATACTCTGAGGGTACTTTGACACGTGACGCAATTAAACAAGAACTTGACTTACCGCAGAAGGACGGCGAGCCATTGGATCAATTTCTTTGGCGGAAGCGGGACCTGTATCAGACGCTCTATATTGATGCAGAGGAGGAAGAAGTAATCCAATACGTTGTTGGCACACTCCAACCGAAATTAAAACGTTTCCTTTCCCACCCGTATCCGAAAACTTTGGAACAGTTAATCCAACGTGGGAAAGAGGTGGAAGGCAACCTCGATAACTCTGAGGAGCCTAGCCCGCAACGGAGTCCAAAGCACCAATTGGGTGGTAGCGTCGAGAGCCTCCCACCTTCGTCGACCGCAAGTCCTGTTGCGTCAGACGAGACTCACCCAGACGTGAGCGCACCTCCGGTAACGGTGATT SEQ ID NO: 
37GGGGACGGCGAGACTCAAGCTGAGAATCCATCTACCAGCTTGAACAACACTGACGAAGATATCTTGGAACAGCTCAAGAAAATTGTCATGGATCAACAACACCTGTATCAGAAAGAATTAAAGGCATCTTTTGAACAACTCAGTCGCAAAATGTTTTCCCAGATGGAACAAATGAATAGCAAGCAAACGGATCTGCTTTTAGAACATCAAAAACAGACTGTCAAACATGTAGACAAGCGCGTGGAGTATTTGCGGGCGCAATTCGATGCATCGTTAGGCTGGCGGTTGAAAGAGCAACACGCGGATATTACGACCAAAATCATTCCTGAGATCATCCAAACGGTGAAGGAAGATATTAGCCTGTGTCTTTCTACGCTCTGCAGTATCGCTGAAGATATCCAGACATCACGGGCTACCACTGTCACAGGGCATGCTGCCGTACAAACCCATCCTGTGGATCTTTTGGGTGAACACCATTTAGGGACCACGGGGCACCCACGCTTACAGTCGACCCGTGTAGGGAAACCAGACGACGTACCTGAGTCGCCGGTAAGCCTGTTTATGCAAGGTGAGGCGCGTTCCCGGATCGTTGGCAAGAGTCCGATTAAACTGCAATTTCCGACGTTCGGCAAAGCAAACGATTCTTCCGACCCACTCCAATATCTGGAGCGGTGTGAGGACTTTCTTGCTCTTAACCCTTTAACTGATGAGGAACTTATGGCTACTTTGCGGAATGTGTTACATGGCACCTCTCGGGATTGGTGGGATGTCGCACGTCATAAAATCCAAACTTGGCGTGAGTTTAATAAACACTTCCGGGCGGCTTTCCTCAGCGAGGATTATGAAGATGAGTTGGCTGAGCGCGTCCGTAACCGCATCCAAAAAGAAGATGAGTCTATCCGCGATTTCGCTTATATGTATCAGTCCTTGTGCAAGCGGTGGAACCCTGCTATCTGCGAAGGTGATGTAGTAAAGCTCATCCTGAAGAACATCAATCCACAACTGCCGTCTCAGTTACGCTCCCGGGTCACGACCGTGGATGAGCTTGTTCGCTTGGGCCAGCAGCTTGAAAAAGATCGTCAGAATCAGCTCCAATATGAGCTTCGGAAGAGTTCCGGCAAAATTATCCAAAAATCTAGTTCGTGCGAAACTTCAGCGCTCCCGAACACGAAGAGTACACCTAATCAACAAAACCCTGCTACCAGTAACCGTCCTCCACAGGTGTATTGCTGGCGGTGTAAGGGTCACCATGCCCCTGCCTCTTGTCCGCAATGGAAAGCTGATAAGCACCGTGCGCAACCTTCGCGGAGTTCTGGGCCACAAACTCTGACTAATCTCCAAGCTCAAGACATC SEQ ID NO: 
38GGGGAATTGGATCAACGTGCGGCAGGGGGCTTGCGCGCGTACCCGGCGCCGCGTGGTGGTCCAGTTGCCAAACCGAGCGTAATTCTTCAGATTGGTAAGTGCCGCGCTGAGATGCTGGAACACGTCCGCCGCACGCATCGCCATCTTCTGACGGAGGTAAGTAAACAAGTGGAGCGCGAACTCAAGGGGTTACATCGGTCTGTCGGTAAGTTGGAGGGCAATTTAGACGGCTATGTGCCTACCGGTGATTCCCAACGCTGGAAAAAAAGTATCAAGGCGTGTCTCTGCCGGTGTCAGGAAACAATTGCAAATCTCGAGCGTTGGGTGAAACGTGAGATGCATGTTTGGCGTGAGGTATTCTATCGTTTGGAACGGTGGGCAGACCGTTTGGAGTCTATGGGGGGCAAGTATCCGGTGGGCACTAACCCGTCGCGGCACACAGTAAGTGTCGGGGTAGGGGGCCCGGAAGGCTATTCTCATGAAGCGGATACTTATGACTACACGGTGTCTCCGTATGCTATCACGCCACCGCCTGCCGCGGGTGAGTTGCCTGGTCAAGAGGCTGTCGAGGCACAACAGTACCCTCCATGGGGTCTGGGGGAGGACGGGCAACCAGGTCCGGGCGTGGACACGCAGATTTTTGAGGACCCTCGCGAATTTTTGAGCCACTTAGAGGAGTACCTGCGGCAAGTAGGGGGGAGTGAAGAGTACTGGTTATCGCAAATTCAAAATCATATGAATGGCCCTGCGAAGAAATGGTGGGAGTTCAAACAGGGGTCAGTCAAGAATTGGGTCGAGTTTAAGAAAGAATTTTTGCAATACAGTGAGGGTACGTTGAGTCGCGAGGCCATCCAACGTGAACTGGACCTCCCTCAGAAGCAGGGGGAGCCGTTAGATCAATTTTTATGGCGGAAACGTGACTTATACCAAACCCTCTACGTTGACGCTGAGGAAGAAGAAATTATTCAATATGTTGTCGGTACGCTGCAGCCAAAGCTGAAGCGGTTCCTCCGTCCTCCACTCCCTAAAACCTTAGAACAATTAATCCAAAAAGGCATGGAAGTTCAGGACGGGTTAGAACAAGCGGCCGAACCGGCCTCTCCGCGTCTGCCGCCGGAAGAGGAGAGTGAGGCTCTTACGCCTGCGCTCACGAGCGAATCAGTAGCCTCCGATCGGACACAGCCAGAG SEQ ID NO: 
39GGGCAGCTTGACAATGTGACGAACGCGGGGATTCACAGCTTTCAAGGGCACCGCGGCGTCGCCAACAAACCGAATGTCATTCTGCAAATCGGTAAATGTCGTGCTGAAATGCTTGAGCACGTTCGTCGTACCCATCGTCACTTGCTTTCTGAAGTATCAAAACAAGTGGAGCGGGAACTCAAAGGCCTGCAAAAGTCAGTGGGTAAATTGGAGAATAACCTCGAAGACCATGTACCTACAGACAACCAGCGGTGGAAAAAATCTATCAAGGCATGCCTCGCTCGTTGCCAGGAGACTATTGCCCATCTTGAGCGGTGGGTGAAACGTGAAATGAACGTATGGAAGGAAGTATTTTTTCGCTTAGAGAAGTGGGCTGATCGTCTTGAATCGATGGGCGGCAAGTACTGTCCTGGGGAACACGGCAAACAAACTGTATCTGTCGGCGTGGGGGGCCCGGAGATCCGGCCATCGGAAGGGGAAATTTATGATTATGCTCTCGACATGTCCCAAATGTATGCTCTCACACCAGGGCCAGGGGAAGTACCGTCAATTCCGCAAGCACACGACAGCTACCAATGGGTATCTGTGAGCGAGGACGCGCCTGCCTCTCCGGTTGAGACGCAAATCTTTGAGGACCCACATGAATTTTTGTCTCATCTTGAAGAATATCTCAAACAGGTTGGCGGCACAGAAGAATACTGGTTATCTCAGATCCAGAATCACATGAACGGCCCGGCTAAAAAGTGGTGGGAGTATAAGCAAGATTCCGTAAAGAACTGGGTCGAATTCAAGAAAGAGTTTCTTCAATACTCTGAGGGTACTCTGACGCGCGATGCAATTAAGCGGGAGTTAGACCTTCCACAAAAAGAGGGGGAGCCTCTTGACCAGTTCCTGTGGCGTAAGCGCGACCTCTATCAGACACTTTACGTCGACGCTGATGAAGAAGAGATTATTCAATATGTTGTGGGTACCCTGCAGCCAAAGCTTAAGCGTTTCCTTAGCTACCCACTTCCGAAAACTCTGGAGCAGCTCATTCAACGCGGTAAGGAAGTGCAGGGCAACATGGACCACTCTGAAGAGCCTAGCCCGCAGCGCACTCCTGAAATCCAATCAGGTGACAGTGTGGAGTCAATGCCGCCGTCAACCACCGCTTCTCCGGTACCTAGCAACGGGACGCAACCAGAGCCTCCAAGCCCACCGGCTACAGTCATC SEQ ID NO: 
40GGGCAACTTGAGAATATTAACCAAGGTTCCCTGCACGCGTTTCAGGGTCATCGCGGCGTGGTCCATAACAACAAGCCTAACGTTATTCTCCAGATCGGGAAGTGCCGCGCCGAAATGCTGGAGCATGTGCGGCGCACCCATCGCCATTTGCTCACTGAAGTATCAAAACAGGTGGAGCGTGAGTTGAAGGGGTTGCAGAAAAGTGTAGGCAAACTTGAAAATAATTTAGAAGACCACGTACCAAGTGCGGCTGAGAACCAACGCTGGAAGAAGTCGATTAAAGCCTGCTTAGCGCGTTGTCAGGAGACCATTGCGAACTTGGAACGCTGGGTTAAACGTGAGATGAATGTTTGGAAGGAGGTCTTTTTCCGCTTAGAGCGCTGGGCAGATCGCCTCGAATCCGGGGGTGGCAAGTACTGCCATGCAGACCAGGGTCGCCAAACTGTCAGCGTAGGTGTTGGTGGTCCTGAAGTGCGTCCGTCTGAAGGTGAAATTTACGATTACGCGTTGGATATGAGCCAAATGTACGCCTTGACTCCGCCGCCTATGGGTGATGTTCCAGTAATTCCTCAGCCGCATGACAGTTATCAGTGGGTGACAGATCCGGAAGAAGCGCCACCAAGTCCGGTTGAGACACAAATTTTCGAGGACCCTCGGGAGTTTCTGACCCATCTTGAGGATTATTTAAAACAAGTCGGCGGGACAGAGGAATATTGGCTCTCACAGATCCAAAATCATATGAATGGGCCAGCGAAAAAGTGGTGGGAATATAAACAGGATAGTGTGAAGAACTGGCTTGAGTTCAAAAAAGAATTCTTGCAGTACTCAGAAGGCACGTTAACGCGGGACGCTATTAAACAGGAACTTGACCTTCCACAAAAAGAAGGGGAACCGCTGGATCAATTCCTCTGGCGCAAACGCGATTTGTACCAAACTCTCTACGTCGAGGCAGAAGAAGAGGAGGTCATCCAATATGTAGTTGGCACACTGCAACCAAAACTGAAGCGGTTTCTTTCTCATCCGTACCCTAAAACCCTGGAGCAACTCATCCAGCGCGGGAAGGAAGTTGAGGGGAATTTGGACAATAGTGAAGAACCGTCTCCACAGCGGACCCCAGAACATCAGCTGGGGGACAGTGTGGAATCTTTGCCGCCTAGTACTACGGCTTCGCCTGCCGGTTCGGATAAAACGCAACCTGAGATTAGCTTACCTCCAACTACAGTCATT SEQ ID NO: 
41GGGCAATTAGATTCGGTAACCAATGCGGGCGTCCACACCTACCAGGGCCATCGGAGCGTCGCCAATAAACCTAACGTCATTCTTCAAATCGGGAAATGTCGGACTGAGATGCTGGAGCATGTCCGTCGGACTCATCGCCACCTGCTCACAGAAGTGTCAAAGCAAGTGGAACGTGAACTCAAGGGCTTACAGAAGAGCGTGGGCAAACTGGAAAACAATCTTGAAGACCATGTCCCAACTGACAATCAGCGGTGGAAGAAGTCAATCAAGGCATGTCTCGCGCGTTGCCAAGAGACCATTGCTCACCTTGAGCGGTGGGTGAAACGTGAAATGAACGTGTGGAAGGAGGTGTTCTTCCGGTTAGAACGCTGGGCCGACCGCCTTGAATCAATGGGTGGTAAATACTGCCCGACGGACTCTGCACGTCAGACAGTTAGCGTTGGGGTGGGGGGCCCGGAAATTCGGCCTAGTGAAGGCGAAATCTATGACTACGCGCTCGATATGAGCCAAATGTACGCTCTTACGCCGTCACCGGGCGAATTGCCGTCCGTCCCTCAACCGCATGATTCATACCAGTGGGTCACTAGTCCGGAAGACGCTCCGGCGTCACCAGTTGAAACGCAGGTATTCGAGGATCCTCGGGAGTTCTTGTGTCATTTGGAAGAGTACCTGAAGCAGGTTGGCGGTACAGAGGAATATTGGCTGAGCCAGATTCAGAATCATATGAATGGTCCTGCAAAAAAGTGGTGGGAATATAAACAAGACACGGTTAAGAATTGGGTGGAATTCAAGAAGGAGTTCTTACAATACAGTGAGGGTACACTTACCCGTGATGCGATTAAGCGGGAATTAGACCTCCCGCAAAAGGACGGTGAGCCTCTGGATCAATTTTTATGGCGTAAGCGTGACCTCTATCAGACATTATACATTGATGCCGATGAAGAACAGATCATTCAGTACGTCGTGGGGACATTGCAACCTAAACTCAAGCGGTTCTTGTCCTATCCACTTCCAAAAACTCTTGAACAATTAATCCAGAAAGGGAAGGAGGTGCAGGGTTCACTTGACCACAGCGAGGAGCCGAGTCCTCAACGTGCGAGCGAGGCTCGGACGGGCGATAGTGTGGAAACCTTGCCGCCTTCTACCACTACATCACCAAATACGTCATCTGGTACACAGCCAGAGGCACCATCGCCTCCAGCGACGGTAATC SEQ ID NO: 
42GGGCAGTTAGACAGTGTGACTAACGCCGGGGTGCATACGTACCAGGGGCACCGCGGGGTCGCCAATAAGCCAAATGTAATTCTCCAGATTGGGAAGTGTCGTACAGAGATGTTGGAACATGTCCGTCGCACTCATCGCCACTTGCTCACCGAGGTCTCCAAACAAGTAGAACGCGAACTCAAGGGGCTCCAGAAGAGTGTTGGGAAGTTGGAGAATAACCTCGAAGACCACGTTCCGACAGATAACCAACGGTGGAAAAAGTCTATTAAAGCCTGTCTCGCCCGTTGTCAAGAGACAATCGCACACTTGGAACGCTGGGTCAAACGGGAGATGAATGTGTGGAAGGAAGTCTTCTTCCGTCTCGAGCGGTGGGCGGATCGTTTAGAAAGTATGGGCGGTAAATATTGCCCAACTGACTCGGCTCGTCAAACGGTGTCGGTTGGCGTAGGCGGCCCGGAAATTCGCCCTAGCGAGGGTGAGATCTATGACTATGCACTTGACATGAGTCAGATGTATGCGTTAACTCCGTCGCCAGGGGAGCTTCCAAGTATTCCACAGCCTCACGATAGTTATCAATGGGTAACTTCTCCTGAAGACGCCCCAGCATCCCCAGTTGAGACACAAGTATTCGAGGACCCTCGTGAGTTTCTCTGTCACCTCGAGGAGTACCTTAAACAGGTAGGCGGGACCGAAGAGTACTGGTTATCGCAAATCCAAAACCATATGAATGGTCCTGCCAAAAAGTGGTGGGAGTATAAACAAGATACTGTGAAGAATTGGGTAGAGTTCAAGAAAGAGTTCTTACAGTACTCTGAGGGGACGTTAACTCGTGATGCGATCAAGCGCGAATTGGATTTACCTCAGAAGGACGGCGAGCCACTCGACCAGTTCTTATGGCGCAAGCGTGACTTGTATCAAACCCTTTATATCGATGCTGACGAGGAACAAATTATCCAGTACGTAGTCGGTACGTTGCAACCAAAACTTAAACGCTTTCTGAGCTACCCATTACCTAAAACGTTGGAGCAACTGATCCAGAAAGGTAAAGAGGTGCAAGGGAGCCTGGATCATAGTGAAGAACCGAGCCCTCAGCGGGCTTCTGAAGCTCGGACCGGTGATAGCGTCGAATCTTTACCACCTAGTACCACAACCAGCCCGAATGCGTCATCTGGTACCCAACCTGAAGCGCCTTCCCCACCTGCTACAGTCATT SEQ ID NO: 
43GGGCAGCTCGAGAATGTCAACCATGGGAACCTCCATTCTTTTCAAGGTCATCGCGGCGGCGTCGCCAACAAGCCAAACGTTATCTTGCAGATCGGTAAATGTCGTGCAGAGATGCTGGACCACGTCCGGCGGACCCACCGGCATTTACTGACAGAGGTATCGAAACAGGTTGAACGTGAGTTGAAGGGGTTACAGAAATCAGTAGGGAAATTAGAAAATAACTTAGAAGACCATGTCCCTTCAGCCGTTGAAAACCAGCGTTGGAAAAAATCGATCAAGGCCTGCCTTTCCCGCTGCCAAGAGACCATTGCCCACCTTGAGCGTTGGGTGAAGCGCGAGATGAACGTATGGAAAGAGGTTTTCTTCCGCTTAGAGCGGTGGGCAGATCGGTTGGAATCTGGGGGCGGGAAATATTGTCACGGTGATAATCATCGTCAAACAGTATCAGTCGGTGTTGGCGGCCCTGAGGTACGTCCATCTGAAGGCGAAATTTACGATTACGCTCTCGACATGTCGCAAATGTACGCTTTAACACCGCCTAGCCCAGGGGATGTGCCTGTAGTTAGCCAGCCGCACGACAGCTATCAGTGGGTTACGGTTCCGGAGGATACCCCTCCATCCCCGGTGGAGACGCAAATCTTCGAGGACCCACGGGAGTTCTTGACCCACTTAGAGGATTACTTAAAGCAAGTGGGGGGTACAGAGGAATATTGGTTATCTCAGATCCAGAATCACATGAACGGGCCAGCCAAGAAGTGGTGGGAGTATAAGCAAGACTCAGTAAAAAATTGGCTCGAGTTTAAGAAGGAATTCCTTCAGTATTCCGAGGGGACACTTACGCGCGACGCTATCAAGGAAGAACTTGACCTCCCGCAAAAGGACGGGGAACCTCTTGATCAGTTCCTGTGGCGCAAGCGCGACTTGTACCAGACCCTGTACGTGGAGGCGGATGAGGAGGAGGTGATCCAGTATGTTGTGGGGACTTTACAACCTAAATTAAAGCGTTTTCTCTCACACCCTTACCCGAAAACGTTAGAGCAACTTATCCAACGGGGCAAAGAGGTGGAAGGGAACCTCGACAATTCAGAGGAACCAACACCTCAGCGTACTCCAGAACACCAACTGTGTGGTTCTGTAGAATCGCTGCCTCCTTCCTCTACCGTCAGTCCAGTGGCTAGCGATGGTACTCAACCTGAGACTTCGCCATTGCCAGCGACTGTTATT SEQ ID NO: 
44GGGCCATTGACGTTGTTACAAGACTGGTGTCGTGGTGAACATTTAAACACCCGCCGGTGCATGTTGATCCTCGGTATCCCAGAAGATTGCGGCGAGGATGAGTTCGAAGAGACACTTCAGGAGGCGTGTCGCCATTTAGGGCGGTACCGCGTGATCGGCCGCATGTTCCGTCGTGAGGAAAATGCCCAAGCGATCCTCTTGGAATTGGCGCAGGATATTGACTATGCCTTACTCCCTCGGGAAATCCCTGGGAAAGGCGGGCCTTGGGAGGTAATTGTGAAGCCGCGTAATTCCGACGGCGAATTCTTAAATCGGCTTAATCGCTTTCTTGAAGAGGAGCGCCGTACGGTCTCCGATATGAACCGTGTTTTGGGCTCGGATACTAACTGTTCAGCTCCTCGTGTCACCATTAGTCCTGAATTCTGGACTTGGGCACAGACGCTGGGCGCAGCTGTCCAACCATTGCTCGAACAGATGCTCTACCGGGAGTTACGGGTCTTCAGTGGCAATACGATTTCCATCCCAGGTGCTCTCGCTTTTGACGCGTGGCTGGAGCATACCACGGAAATGCTTCAAATGTGGCAGGTGCCTGAAGGGGAGAAACGGCGGCGCTTGATGGAGTGTTTGCGGGGGCCAGCCCTGCAAGTCGTTAGTGGGTTACGTGCATCGAATGCCAGTATCACTGTCGAAGAGTGTCTTGCTGCACTGCAGCAGGTATTCGGTCCAGTGGAAAGTCATAAGATTGCCCAAGTAAAGTTATGCAAAGCTTACCAGGAGGCTGGGGAAAAAGTAAGCAGCTTCGTTTTGCGTTTGGAGCCACTGCTTCAGCGTGCTGTAGAAAACAACGTGGTCAGTCGCCGCAATGTCAACCAAACACGTCTTAAGCGTGTTCTGTCGGGCGCCACCCTTCCTGACAAGCTGCGTGATAAATTGAAGTTAATGAAACAGCGCCGTAAACCGCCGGGTTTCTTGGCGTTGGTTAAACTGTTACGTGAAGAGGAGGAGTGGGAGGCCACCTTAGGGCCAGACCGCGAGTCATTGGAGGGGTTAGAAGTGGCACCGCGCCCGCCAGCACGGATTACGGGTGTTGGCGCAGTACCTCTTCCGGCATCCGGGAATTCATTTGATGCCCGTCCTTCGCAAGGGTACCGGCGCCGTCGGGGTCGTGGTCAGCACCGTCGGGGCGGCGTTGCTCGTGCAGGCTCTCGTGGCTCTCGTAAGCGGAAACGGCACACCTTCTGCTATTCCTGTGGTGAGGATGGCCATATTCGTGTCCAATGCATTAACCCTAGCAATCTCCTGTTGGCTAAGGAGACCAAAGAGATTTTGGAAGGGGGAGAACGTGAAGCGCAAACGAATTCACGT SEQ ID NO: 
45GGGGCTCTTACGCTCTTAGAAGACTGGTGTAAGGGTATGGACATGGACCCGCGGAAGGCTCTCCTGATTGTAGGTATTCCGATGGAATGCAGTGAGGTGGAAATCCAGGATACAGTTAAAGCTGGTCTTCAACCTCTGTGCGCTTATCGTGTACTCGGCCGTATGTTCCGGCGGGAGGATAATGCGAAGGCTGTTTTCATTGAGCTGGCAGACACCGTGAATTACACCACGTTACCGTCTCACATTCCGGGTAAAGGGGGTTCCTGGGAAGTCGTTGTTAAACCTCGGAACCCTGACGACGAGTTCCTTTCTCGGCTTAACTACTTCTTGAAAGATGAGGGCCGCTCGATGACGGATGTCGCCCGGGCACTGGGGTGCTGTAGCTTACCTGCGGAATCACTGGACGCGGAAGTAATGCCACAGGTCCGCTCCCCACCATTAGAACCTCCAAAAGAGAGTATGTGGTACCGTAAGTTAAAAGTGTTTAGTGGTACCGCGTCGCCTTCGCCGGGGGAGGAGACATTTGAGGACTGGTTAGAGCAAGTCACCGAGATCATGCCTATCTGGCAAGTATCTGAAGTTGAAAAGCGCCGTCGGTTACTGGAGTCACTCCGGGGCCCGGCACTCTCAATTATGCGCGTGTTACAAGCCAATAACGATAGCATTACCGTTGAACAGTGTTTGGATGCATTAAAGCAGATCTTTGGCGACAAGGAAGACTTCCGTGCCTCTCAATTTCGTTTTCTTCAAACGTCCCCTAAAATTGGGGAGAAGGTGAGTACGTTCCTGCTGCGTTTAGAGCCACTCTTGCAAAAGGCCGTTCACAAGAGCCCACTTTCGGTACGTAGTACTGATATGATTCGGTTAAAGCACCTGTTGGCACGCGTAGCCATGACCCCGGCACTGCGTGGTAAACTCGAATTACTCGACCAACGCGGGTGCCCACCTAATTTTCTTGAGCTGATGAAGCTGATCCGGGATGAGGAAGAGTGGGAGAATACTGAAGCTGTGATGAAAAATAAAGAGAAACCTTCAGGTCGTGGCCGCGGTGCATCAGGCCGTCAAGCTCGCGCCGAGGCCAGTGTAAGTGCTCCGCAAGCAACAGTCCAAGCACGTAGCTTCTCTGATTCTAGCCCGCAGACGATTCAGGGGGGCTTACCACCTCTTGTCAAGCGTCGGCGCCTTTTGGGTTCGGAGAGCACACGTGGGGAAGACCACGGGCAAGCTACTTATCCGAAAGCAGAGAATCAGACTCCAGGGCGTGAGGGCCCGCAGGCGGCTGGGGAGGAACTTGGTAATGAGGCCGGGGCCGGCGCGATGTCCCACCCGAAACCGTGGGAAACC SEQ ID NO: 
46GGGGCTGTGACAATGCTCCAGGACTGGTGCCGTTGGATGGGCGTGAACGCTCGGCGGGGGCTGTTAATCTTAGGTATCCCTGAAGACTGTGACGATGCAGAGTTCCAAGAGTCGTTAGAAGCTGCACTCCGTCCTATGGGTCACTTTACTGTACTCGGTAAGGCCTTCCGCGAGGAAGACAACGCTACCGCTGCGCTGGTGGAATTAGATCGCGAGGTTAATTACGCACTTGTTCCACGCGAAATTCCGGGCACCGGCGGGCCTTGGAACGTCGTGTTCGTTCCTCGGTGCTCCGGCGAGGAATTCCTGGGGTTAGGCCGCGTGTTCCACTTTCCTGAACAGGAGGGCCAAATGGTAGAATCGGTTGCGGGGGCACTGGGGGTAGGTCTGCGCCGCGTGTGTTGGTTACGCTCGATCGGGCAAGCTGTACAACCATGGGTAGAAGCTGTTCGCTGCCAAAGCTTAGGGGTATTTAGTGGTCGTGATCAACCTGCACCTGGTGAAGAAAGCTTCGAGGTCTGGTTGGATCATACGACCGAGATGTTGCATGTGTGGCAAGGCGTGTCGGAACGGGAACGGCGCCGTCGTCTGCTGGAAGGGCTGCGTGGCACAGCCTTACAACTTGTACATGCCTTACTGGCAGAAAATCCGGCACGGACAGCACAAGATTGCTTGGCTGCATTAGCCCAAGTTTTTGGTGATAACGAAAGCCAGGCAACGATTCGTGTTAAATGTTTGACAGCCCAACAGCAGAGTGGCGAACGCCTCTCTGCGTTCGTTCTCCGCTTAGAAGTACTTCTGCAAAAGGCTATGGAGAAGGAAGCATTGGCGCGCGCGTCAGCGGATCGGGTGCGTCTTCGTCAGATGCTGACACGCGCACATCTCACAGAGCCGTTGGATGAAGCCTTACGGAAATTGCGTATGGCAGGGCGTTCTCCGTCTTTTTTGGAAATGCTCGGCTTAGTACGCGAGTCAGAGGCCTGGGAGGCAAGTCTGGCTCGGTCCGTCCGGGCGCAAACCCAGGAGGGTGCAGGGGCCCGGGCGGGGGCCCAAGCAGTTGCGCGTGCCAGCACTAAGGTTGAAGCTGTACCTGGTGGCCCTGGCCGGGAGCCAGAAGGTCTCCTCCAAGCCGGGGGCCAAGAAGCGGAAGAACTTCTCCAAGAGGGCTTAAAGCCGGTTTTAGAGGAATGTGACAAT SEQ ID NO: 
47GGGGCGGTCACCATGTTGCAAGACTGGTGTCGGTGGATGGGCGTGAATGCTCGGCGGGGTTTATTGATCTTGGGTATCCCAGAAGACTGTGACGACGCCGAGTTTCAGGAGTCGCTCGAGGCCGCCCTTCGTCCAATGGGGCATTTTACGGTTCTGGGCAAGGTGTTCCGTGAAGAGGATAACGCTACAGCAGCTCTTGTGGAGCTTGACCGTGAGGTGAATTATGCGTTAGTACCTCGCGAGATTCCAGGTACCGGTGGGCCATGGAACGTAGTCTTCGTCCCACGTTGCTCGGGGGAGGAATTTCTGGGGCTTGGGCGCGTATTCCACTTTCCAGAACAGGAAGGGCAGATGGTCGAAAGCGTAGCAGGCGCTCTTGGCGTTGGTCTCCGGCGCGTGTGCTGGTTACGCTCCATCGGCCAAGCAGTCCAACCATGGGTTGAAGCCGTACGCTATCAATCTTTAGGTGTCTTCTCAGGCCGTGACCAGCCGGCGCCTGGTGAGGAATCCTTCGAAGTCTGGCTCGATCATACAACTGAGATGCTGCATGTATGGCAAGGTGTCTCAGAGCGGGAACGGCGGCGGCGGTTATTAGAGGGGCTCCGTGGGACTGCGCTCCAATTAGTACATGCGCTTTTGGCCGAAAATCCAGCCCGTACTGCCCAAGATTGTCTGGCAGCACTCGCCCAAGTATTCGGCGACAACGAATCGCAGGCAACAATCCGCGTAAAGTGTCTTACAGCACAGCAGCAGTCAGGGGAACGTCTTAGTGCGTTCGTTCTGCGGCTGGAAGTGTTACTCCAGAAAGCCATGGAAAAGGAGGCATTGGCTCGCGCGAGCGCTGACCGTGTACGTCTGCGGCAAATGCTTACTCGCGCACATCTCACCGAGCCTCTCGATGAAGCACTGCGGAAACTGCGCATGGCAGGCCGCAGCCCGTCTTTCCTGGAAATGTTAGGCTTAGTCCGGGAGTCCGAAGCCTGGGAGGCCAGTCTGGCACGGTCAGTGCGGGCACAAACGCAAGAGGGTGCAGGGGCACGGGCGGGTGCACAAGCAGTTGCACGTGCCTCCACTAAAGTTGAGGCAGTGCCGGGTGGGCCAGGCCGTGAACCGGAGGGTTTGCGCCAAGCCGGCGGGCAGGAAGCCGAAGAATTACTCCAAGAAGGTTTAAAACCGGTTTTGGAGGAATGCGATAAC SEQ ID NO: 
48GGGGTGGAAGATTTGGCGGCATCTTACATCGTATTAAAGCTTGAGAACGAAATCCGGCAGGCGCAGGTCCAATGGTTAATGGAGGAAAACGCCGCCCTGCAGGCCCAGATCCCTGAACTTCAAAAGTCGCAAGCCGCGAAGGAGTATGATCTTCTGCGTAAATCTTCGGAGGCGAAGGAGCCGCAAAAACTGCCAGAACATATGAATCCACCGGCCGCTTGGGAAGCACAAAAGACTCCAGAGTTTAAGGAACCACAGAAACCTCCTGAACCACAGGATTTGCTTCCTTGGGAGCCGCCTGCTGCCTGGGAGTTGCAAGAAGCACCGGCTGCCCCTGAGTCACTGGCTCCGCCTGCAACCCGTGAGTCTCAGAAACCACCTATGGCGCATGAAATCCCTACTGTATTGGAGGGGCAAGGGCCTGCCAACACACAAGACGCTACGATTGCTCAAGAACCAAAGAATAGCGAGCCGCAAGACCCTCCAAATATCGAGAAACCTCAGGAAGCTCCGGAATATCAAGAAACAGCGGCACAGTTGGAGTTTTTAGAACTTCCTCCACCTCAGGAGCCACTCGAACCGAGCAATGCGCAAGAATTTCTCGAGTTGTCGGCTGCCCAGGAGTCCTTAGAAGGCCTCATTGTAGTTGAAACGTCCGCGGCTTCGGAGTTCCCACAGGCTCCTATCGGGCTTGAAGCCACCGACTTTCCGCTGCAGTACACGCTTACCTTCTCTGGCGACAGCCAGAAGTTGCCAGAATTTTTGGTCCAACTCTACAGTTATATGCGGGTACGTGGGCACTTATACCCTACCGAGGCGGCGTTAGTGTCGTTTGTAGGCAATTGTTTCTCAGGGCGCGCGGGCTGGTGGTTTCAGTTGCTTTTGGATATCCAGTCGCCTCTGTTAGAACAGTGTGAAAGTTTTATCCCGGTTCTCCAAGACACATTTGACAATCCGGAAAACATGAAGGACGCAAACCAATGCATCCACCAGCTTTGTCAGGGCGAGGGTCATGTGGCCACACACTTCCACCTCATTGCACAAGAGCTTAATTGGGATGAAAGCACGCTGTGGATCCAGTTCCAGGAAGGCCTGGCCTCATCCATCCAGGATGAACTTTCCCATACATCGCCTGCTACCAACCTGAGTGATCTGATTACTCAATGCATCTCATTAGAGGAAAAGCCTGACCCAAACCCGTTAGGGAAGTCCTCCTCGGCGGAGGGGGATGGCCCGGAAAGTCCGCCAGCAGAAAACCAACCTATGCAAGCTGCGATCAATTGTCCTCACATTTCCGAAGCAGAGTGGGTTCGTTGGCACAAAGGCCGGCTTTGTCTCTATTGCGGCTATCCGGGTCACTTCGCACGTGATTGCCCAGTGAAGCCACACCAGGCGTTACAGGCAGGGAACATTCAGGCTTGCCAA SEQ ID NO: 
49GGGGTGCAGCCGCAGACTAGCAAAGCTGAATCGCCGGCTCTCGCTGCCTCACCGAACGCACAAATGGATGACGTTATTGATACATTAACCTCCCTGCGTCTGACGAATTCGGCTCTGCGGCGGGAGGCTAGCACTCTTCGGGCCGAGAAAGCAAATTTAACTAATATGCTCGAGTCAGTGATGGCCGAGTTAACGCTGTTACGGACCCGTGCGCGGATTCCGGGGGCCCTGCAGATTACGCCACCAATTTCGTCTATTACTAGCAACGGTACTCGCCCGATGACGACTCCTCCAACTAGTTTACCTGAACCGTTTTCTGGCGATCCTGGCCGGTTAGCTGGTTTCCTTATGCAGATGGACCGTTTTATGATCTTTCAAGCTAGCCGGTTTCCAGGGGAGGCAGAGCGTGTTGCGTTCCTGGTGTCGCGCTTAACTGGCGAAGCAGAAAAATGGGCCATTCCTCACATGCAACCAGACTCTCCTTTGCGTAACAACTATCAAGGCTTCTTAGCAGAGTTACGGCGGACCTATAAGAGCCCGTTGCGTCACGCCCGGCGGGCGCAAATCCGGAAGACATCGGCCTCGAACCGGGCAGTCCGTGAACGCCAAATGCTTTGCCGGCAACTTGCATCAGCAGGTACAGGCCCATGCCCGGTACACCCTGCTAGTAACGGGACTTCCCCGGCACCGGCATTACCAGCACGGGCGCGTAACTTA SEQ ID NO: 50GGGGACGGTCGGGTACAGTTGATGAAGGCTTTATTGGCTGGCCCTTTACGTCCGGCGGCACGCCGTTGGCGGAATCCTATTCCATTTCCAGAGACTTTTGATGGGGATACTGATCGCCTCCCGGAGTTTATCGTCCAAACTTCGTCCTACATGTTCGTTGACGAAAATACTTTCTCTAACGACGCTCTGAAAGTGACATTTCTCATTACCCGGCTGACAGGTCCAGCCTTGCAATGGGTCATTCCGTACATTCGTAAAGAAAGCCCGCTTCTTAACGACTATCGGGGTTTCCTGGCCGAGATGAAGCGGGTTTTTGGGTGGGAAGAGGACGAGGACTTT SEQ ID NO: 51GGGGAAGGTCGGGTGCAACTTATGAAAGCGTTGCTTGCCCGCCCGCTTCGTCCAGCAGCACGTCGCTGGCGGAATCCAATTCCTTTCCCGGAGACTTTTGACGGGGACACCGATCGGCTCCCAGAGTTCATTGTGCAGACGTCAAGCTATATGTTCGTGGATGAGAACACGTTCTCTAACGACGCGTTGAAAGTGACTTTCTTAATTACGCGTTTGACTGGCCCGGCTTTACAATGGGTGATTCCATACATTAAGAAAGAGTCACCGCTTCTCAGTGATTATCGCGGTTTTTTAGCCGAGATGAAGCGGGTCTTCGGGTGGGAAGAAGACGAAGACTTT SEQ ID NO: 
52GGGCCGCGTGGGCGTTGCCGTCAACAAGGTCCTCGGATTCCGATTTGGGCAGCGGCCAACTATGCCAACGCCCACCCGTGGCAACAAATGGATAAGGCTTCGCCAGGCGTTGCTTACACACCTTTGGTTGATCCTTGGATTGAGCGGCCTTGTTGCGGTGACACGGTTTGTGTGCGCACCACAATGGAACAGAAGAGCACAGCGTCAGGCACTTGTGGTGGTAAGCCTGCTGAGCGTGGTCCTCTCGCGGGGCATATGCCGAGCTCACGCCCACATCGGGTTGATTTCTGTTGGGTTCCTGGTAGCGACCCAGGCACATTCGACGGCAGTCCATGGCTCTTAGATCGCTTTTTGGCGCAACTTGGTGATTACATGAGTTTTCACTTTGAACACTACCAGGACAATATCAGCCGTGTCTGCGAGATTCTTCGTCGGTTAACGGGCCGCGCTCAGGCATGGGCTGCTCCTTACCTGGACGGGGACCTTCCACTGCCAGACGACTACGAATTGTTTTGTCAAGACCTTAAGGAGGTAGTACAGGACCCTAACAGTTTCGCCGAGTATCACGCCGTGGTGACTTGTCCACTCCCTCTTGCTTCGTCCCAACTTCCTGTAGCTCCTCAGCTTCCGGTGGTACGCCAATACCTTGCGCGCTTCTTGGAGGGCCTTGCTTTGGATATGGGTACGGCGCCTCGGTCACTCCCGGCCGCTATGGCCACACCGGCAGTCTCCGGCTCGAACTCCGTTTCTCGTTCTGCCTTATTTGAACAACAACTCACAAAGGAATCCACTCCAGGCCCGAAAGAGCCACCTGTTCTCCCTAGCTCGACTTGCTCTAGCAAACCGGGTCCTGTCGAACCAGCCAGTTCACAACCTGAAGAGGCTGCTCCTACCCCGGTGCCGCGTTTGTCAGAGTCGGCTAACCCACCGGCTCAGCGTCCAGACCCTGCTCACCCTGGTGGTCCTAAACCACAAAAAACCGAAGAGGAAGTTTTAGAAACTGAGGGGGACCAGGAAGTTAGCCTGGGGACGCCGCAGGAGGTCGTAGAAGCGCCGGAAACACCAGGTGAACCACCGCTCAGCCCTGGGTTC SEQ ID NO: 53GGGGTTGATGAATTGGTGCTCTTGTTGCACGCGCTGTTAATGCGCCATCGGGCGCTTTCCATTGAAAATTCTCAGTTGATGGAGCAACTTCGCTTGTTGGTCTGCGAACGGGCGAGCCTTCTTCGTCAGGTACGTCCGCCGAGCTGTCCAGTGCCATTTCCTGAGACTTTTAACGGGGAGTCATCACGGTTACCTGAGTTCATCGTCCAAACCGCAAGCTATATGTTAGTTAATGAAAATCGCTTTTGCAATGACGCAATGAAAGTCGCTTTTTTGATTAGCCTTCTTACTGGTGAAGCAGAAGAATGGGTCGTCCCATACATTGAGATGGATTCACCAATTCTTGGGGACTACCGTGCGTTCTTGGATGAGATGAAGCAGTGTTTTGGGTGGGACGATGATGAAGATGACGACGATGAGGAAGAGGAGGATGACTAT SEQ ID NO: 
54GGGCCTGTGGATTTAGGTCAGGCTTTGGGGTTGTTGCCATCCCTCGCTAAGGCCGAAGATTCCCAATTTAGCGAAAGCGATGCAGCTTTACAGGAGGAATTGTCTTCTCCGGAAACCGCACGGCAACTTTTTCGTCAATTTCGCTATCAAGTCATGTCGGGGCCTCATGAAACACTGAAACAGTTACGGAAGTTATGTTTTCAGTGGCTGCAACCTGAAGTCCATACAAAGGAACAAATCCTCGAAATTCTGATGCTGGAACAGTTCTTGACCATTCTGCCTGGTGAAATTCAGATGTGGGTCCGCAAGCAGTGCCCTGGTAGTGGGGAGGAGGCGGTTACGTTAGTAGAATCCCTGAAAGGTGATCCACAACGGCTCTGGCAATGGATCTCCATCCAAGTCCTGGGTCAGGATATCCTGTCTGAGAAAATGGAGTCACCTTCTTGCCAGGTGGGCGAAGTGGAGCCACACCTGGAAGTTGTACCTCAGGAACTGGGGTTAGAGAATTCATCTTCAGGGCCGGGGGAACTTCTTTCGCACATCGTGAAAGAGGAGTCTGACACTGAAGCAGAGTTGGCGTTAGCGGCATCCCAGCCAGCTCGTTTGGAAGAACGGCTGATTCGGGATCAGGACCTTGGGGCGTCCCTCCTCCCGGCAGCACCGCAGGAGCAATGGCGTCAATTAGACAGCACTCAAAAAGAACAATATTGGGACCTGATGCTGGAGACCTACGGCAAAATGGTATCCGGCGCGGGTATCTCACACCCGAAGTCCGATTTAACGAACTCAATTGAGTTCGGTGAAGAGTTGGCAGGTATTTATTTACATGTAAACGAAAAGATTCCGCGGCCTACCTGCATTGGTGACCGCCAAGAAAACGACAAAGAAAACCTTAATTTGGAAAACCATCGTGACCAGGAATTATTACATGCCAGCTGCCAGGCCTCGGGCGAAGTGCCATCCCAGGCATCGTTACGTGGCTTCTTTACCGAGGACGAACCTGGTTGCTTCGGCGAAGGGGAGAACCTTCCTGAGGCACTTCAGAATATCCAGGATGAGGGGACTGGCGAACAGCTGAGCCCGCAAGAACGCATTAGTGAAAAACAGTTGGGTCAACATTTGCCAAATCCGCACTCGGGGGAGATGTCGACGATGTGGCTTGAAGAAAAACGGGAGACCAGCCAGAAAGGCCAACCACGTGCACCAATGGCGCAGAAATTGCCAACGTGCCGCGAATGTGGCAAAACGTTTTATCGCAATAGTCAACTTATCTTTCACCAACGCACACACACCGGTGAGACATATTTTCAATGCACCATCTGCAAAAAGGCGTTTCTCCGGTCATCTGATTTCGTGAAACATCAGCGGACTCATACTGGCGAAAAACCTTGTAAATGTGACTATTGTGGCAAGGGCTTTAGTGATTTTAGCGGGCTTCGGCATCACGAGAAGATCCATACCGGCGAGAAGCCATACAAGTGTCCAATCTGTGAGAAATCTTTCATCCAGCGCAGTAATTTTAACCGCCACCAACGGGTTCACACCGGTGAAAAGCCTTATAAATGCTCGCATTGTGGCAAGAGCTTCAGCTGGAGCTCCTCGCTCGATAAGCATCAACGTTCACATCTGGGGAAGAAGCCGTTCCAA SEQ ID NO: 
55GGGACTCTCCGCTTACTTGAGGATTGGTGTCGGGGGATGGACATGAACCCACGTAAGGCCCTTCTTATCGCCGGGATTTCCCAGTCATGTTCAGTCGCCGAGATTGAAGAGGCGCTCCAAGCCGGGCTTGCTCCTTTAGGCGAGTATCGTCTCCTTGGGCGGATGTTTCGCCGCGATGAAAATCGCAAAGTAGCGTTGGTTGGTCTCACAGCTGAAACTAGCCATGCGCTTGTACCTAAAGAAATTCCTGGTAAAGGCGGGATCTGGCGGGTTATTTTTAAACCACCGGACCCGGACAATACGTTTCTTTCTCGTTTGAATGAGTTCCTCGCGGGCGAGGGGATGACGGTGGGGGAACTTAGTCGTGCTCTTGGTCACGAAAATGGGTCATTAGACCCTGAACAGGGTATGATTCCGGAAATGTGGGCGCCGATGCTGGCACAGGCTCTGGAGGCTCTCCAACCGGCTTTACAGTGCCTTAAGTACAAGAAGCTGCGCGTTTTTTCAGGGCGCGAGTCTCCAGAGCCGGGTGAGGAGGAATTCGGCCGTTGGATGTTCCATACCACCCAGATGATCAAAGCGTGGCAGGTGCCGGATGTCGAGAAACGCCGCCGGCTGTTGGAATCACTCCGCGGGCCGGCACTTGACGTTATTCGGGTTCTGAAAATTAACAACCCGTTAATTACGGTAGATGAATGTTTGCAAGCACTTGAAGAGGTCTTTGGGGTGACTGACAATCCTCGGGAATTGCAAGTAAAATACTTAACGACCTACCATAAGGACGAGGAGAAATTATCAGCCTACGTACTGCGGCTGGAACCGCTGCTGCAGAAGCTCGTCCAGCGGGGGGCTATTGAACGGGACGCTGTTAATCAGGCTCGCCTGGATCAGGTAATCGCTGGGGCGGTACATAAAACTATCCGCCGTGAGCTGAACCTGCCTGAAGACGGGCCGGCGCCAGGCTTTCTTCAACTCCTCGTTTTGATTAAGGATTACGAGGCAGCTGAAGAGGAGGAAGCATTACTTCAGGCCATTCTTGAAGGGAACTTTACTSEQ ID NO: 
56GGGACAGAACGGCGTCGCGACGAATTAAGTGAAGAAATTAATAATCTTCGTGAAAAGGTTATGAAACAGAGTGAGGAAAACAACAATCTTCAATCCCAAGTCCAGAAACTCACTGAGGAGAATACTACACTCCGTGAGCAAGTTGAACCTACACCTGAAGATGAAGATGACGACATTGAGTTGCGGGGCGCAGCAGCCGCAGCCGCGCCTCCGCCGCCGATCGAGGAGGAATGCCCGGAGGATTTACCGGAAAAATTTGATGGTAATCCGGACATGTTAGCGCCATTCATGGCCCAGTGCCAAATTTTTATGGAAAAGTCTACGCGCGATTTTAGTGTAGATCGCGTACGTGTATGTTTTGTGACGAGCATGATGACTGGTCGCGCAGCCCGTTGGGCGTCAGCGAAATTGGAGCGGTCGCACTACCTGATGCATAATTACCCGGCGTTCATGATGGAGATGAAACACGTGTTTGAAGACCCGCAGCGGCGGGAGGTGGCCAAACGCAAGATCCGGCGGTTGCGGCAGGGCATGGGCAGCGTAATTGATTATAGTAATGCGTTTCAAATGATTGCGCAGGATCTGGATTGGAATGAACCTGCTCTCATTGATCAATATCATGAAGGGCTTAGTGACCATATTCAAGAGGAACTCTCTCACCTGGAAGTGGCTAAATCTCTCTCCGCCCTTATTGGCCAATGCATTCATATTGAGCGCCGTCTTGCACGTGCTGCTGCCGCTCGGAAACCGCGTAGTCCACCACGGGCTTTAGTGCTCCCACATATCGCGTCACACCATCAAGTAGATCCTACTGAGCCAGTGGGGGGTGCACGCATGCGCTTAACCCAAGAAGAAAAGGAACGTCGTCGTAAGCTGAATTTATGCCTGTACTGCGGCACTGGTGGCCATTATGCCGATAACTGTCCTGCCAAAGCCAGTAAGTCAAGCCCGGCTGGGAAACTTCCAGGTCCTGCCGTCGAGGGCCCTTCTGCTACCGGCCCAGAGATTATCCGCTCCCCGCAAGACGATGCGTCGTCGCCTCATCTCCAGGTAATGCTCCAAATCCACCTCCCTGGCCGGCACACACTCTTTGTCCGGGCGATGATTGACTCTGGGGCGTCTGGTAATTTTATTGATCACGAGTATGTTGCTCAAAATGGTATCCCTCTCCGGATCAAAGACTGGCCTATTCTGGTTGAAGCCATCGATGGCCGTCCGATCGCGAGCGGTCCTGTGGTTCATGAAACGCATGACCTCATCGTTGATCTGGGTGACCACCGTGAAGTATTATCCTTTGATGTGACTCAGTCACCGTTTTTTCCAGTTGTTTTGGGCGTCCGTTGGCTTTCGACTCACGATCCTAACATCACGTGGTCGACACGGTCGATTGTCTTCGATTCGGAATATTGTCGTTATCATTGCCGCATGTATTCACCAATTCCGCCGTCTCTCCCGCCGCCTGCGCCGCAACCTCCTCTGTATTACCCGGTGGACGGTTACCGTGTTTACCAGCCAGTTCGCTACTACTACGTACAAAACGTGTACACGCCTGTTGATGAACACGTGTACCCAGATCACCGCCTGGTCGACCCTCATATTGAGATGATCCCGGGTGCGCACTCGATCCCATCGGGCCATGTTTATTCCTTGTCTGAGCCAGAAATGGCCGCCTTACGGGATTTTGTGGCCCGGAATGTCAAAGACGGCCTGATTACCCCGACAATTGCACCAAACGGTGCTCAGGTGTTGCAGGTGAAGCGGGGCTGGAAGTTGCAAGTCAGCTATGATTGTCGTGCGCCAAACAACTTCACTATTCAGAACCAATATCCACGTCTCAGCATCCCTAATCTCGAGGACCAGGCACATCTTGCAACATATACTGAATTTGTACCTCAGATTCCTGGCTATCAGACTTATCCTACGTATGCTGCCTACCCAACATACCCGGTAGGTTTCGCATGGTACCCAGTAGGCCGGGACGGGCAGGGCCGCTCTTTA
TATGTTCCTGTCATGATTACATGGAACCCGCATTGGTACCGCCAGCCTCCGGTCCCACAGTACCCACCTCCTCAACCTCCACCACCTCCGCCGCCTCCTCCACCGCCACCTTCTTACTCGACATTA

What is claimed is:
 1. A composition comprising an endogenous Gag (endo-Gag) polypeptide and a nucleic acid molecule that encodes an antigen.
 2. The composition of claim 1, wherein the endo-Gag polypeptide assembles into an endo-Gag capsid.
 3. The composition of claim 1, wherein the nucleic acid molecule is an mRNA.
 4. The composition of claim 1, wherein the nucleic acid molecule comprises a modified base.
 5. The composition of claim 1, further comprising a delivery component.
 6. The composition of claim 5, wherein the delivery component comprises a liposome or a micelle.
 7. The composition of claim 5, wherein the delivery component comprises a microvesicle or a viral envelope.
 8. The composition of claim 5, wherein the delivery component comprises a fusogenic molecule.
 9. The composition of claim 5, wherein the delivery component comprises a cell-specific binding protein or an engineered protein that binds to an antigen or cell surface molecule.
 10. The composition of claim 1, wherein the endo-Gag polypeptide is a Paraneoplastic Ma antigen family polypeptide.
 11. The composition of claim 1, wherein the endogenous Gag polypeptide is a retrotransposon Gag-like family polypeptide.
 12. The composition of claim 11, wherein the retrotransposon Gag-like family polypeptide is a PEG10 polypeptide.
 13. A method of eliciting an immune response in a subject comprising administering to the subject a composition comprising an endo-Gag polypeptide and a nucleic acid molecule that encodes an antigen.
 14. The method of claim 13, wherein the immune response comprises a B cell response.
 15. The method of claim 13, wherein the immune response comprises a T cell response.
 16. The method of claim 15, wherein the T cell response comprises a CD4+ T cell response, a CD8+ T cell response, a Th1 immune response, a Th2 immune response, a Th17 immune response, a Treg immune response, or a combination thereof.
 17. The method of claim 13, wherein the endo-Gag polypeptide is assembled into an endo-Gag capsid.
 18. The method of claim 13, wherein the nucleic acid molecule is an mRNA.
 19. The method of claim 13, wherein the nucleic acid molecule comprises a modified base.
 20. The method of claim 13, wherein the composition further comprises a delivery component.
 21. The method of claim 20, wherein the delivery component comprises a liposome or a micelle.
 22. The method of claim 20, wherein the delivery component comprises a microvesicle or a viral envelope.
 23. The method of claim 20, wherein the delivery component comprises a fusogenic molecule.
 24. The method of claim 20, wherein the delivery component comprises a cell-specific binding protein or an engineered protein that binds to an antigen or cell surface molecule.
 25. The method of claim 24, wherein the nucleic acid is delivered to a site of interest.
 26. The method of claim 13, wherein the endo-Gag polypeptide is a Paraneoplastic Ma antigen family polypeptide.
 27. The method of claim 13, wherein the endo-Gag polypeptide is a retrotransposon Gag-like family polypeptide.
 28. The method of claim 27, wherein the retrotransposon Gag-like family polypeptide is a PEG10 polypeptide.
 29. The method of claim 13, wherein the nucleic acid molecule is delivered to a cell.
 30. The method of claim 29, wherein the cell is a muscle cell, a skin cell, a blood cell, or an immune cell.